Yesterday
Secret
Mid Level Career (5+ yrs experience)
IT - Software
Tewksbury, MA (Off-Site/Hybrid)
Architect and maintain Splunk ITSI modules, including glass tables, KPI base searches, correlation searches, notable events, and service definitions.
Design service trees and entity models to reflect business-critical services and dependencies.
Monitoring & Analytics:
Implement event aggregation, adaptive thresholding, and noise reduction strategies.
Develop advanced correlation rules to detect anomalies and reduce MTTD and MTTR.
Integration & Automation:
Integrate ITSI with external systems such as CMDBs (e.g., ServiceNow), APM tools (e.g., Dynatrace, AppDynamics), and ticketing systems.
Use REST APIs and modular inputs for data onboarding and automation.
Visualization & Reporting:
Build custom dashboards, drilldowns, and service health scores.
Create deep dives and episode reviews for incident analysis.
Collaboration & Enablement:
Work closely with IT Ops, DevOps, SREs, and business stakeholders.
Provide training, documentation, and support to internal teams.
Performance & Reliability:
Participate in incident response, root cause analysis, and proactive performance monitoring.
Optimize system performance and ensure high availability of observability solutions.
Accelerating IT transformation in the public sector