
🚨$1.73B Veeam–Securiti AI Deal + F5 Zero-Day Risk: Reality of Building an AI-Native SOC Architecture

This week's newsletter covers critical infrastructure threats, including enforcement deadlines from the F5 BIG-IP breach, resilience lessons from the AWS US-East-1 outage, and AI security vulnerabilities. We feature Ariful Huq from Exaforce on building AI-native SOC platforms beyond traditional SIEM architectures, exploring data lake design, detection engineering at scale, and the evolution of security operations teams.

Hello from the Cloud-verse!

This week's Cloud Security Newsletter topic - The Vibe Coding Trap: Why AI SOC Requires More Than LLM Prompts (continue reading)


In case this is your first Cloud Security Newsletter: you are in good company!
You are reading this issue along with friends and colleagues from companies like Netflix, Citi, JP Morgan, LinkedIn, Reddit, GitHub, GitLab, Capital One, Robinhood, HSBC, British Airways, Airbnb, Block, Booking Inc and more, who subscribe to this newsletter because, like you, they want to learn what's new in cloud security each week from industry peers, and many of whom also listen to the Cloud Security Podcast and AI Security Podcast.

Welcome to this week’s Cloud Security Newsletter

This week brings critical enforcement deadlines for the F5 breach, a major AWS outage that tested multi-region resilience, and groundbreaking research showing how prompt injection can lead to remote code execution in AI agents.

We're joined by Ariful Huq, CEO of Exaforce and a Palo Alto Networks veteran, who shares hard-earned lessons from building an AI-native SOC platform from first principles, tackling everything from data architecture to detection engineering and the evolution of security teams in an AI-driven future.

📰 TL;DR for Busy Readers

  • $1.73B Data Consolidation: Veeam to acquire Securiti AI in a ~$1.73 billion cash-and-stock deal. Data resilience + DSPM/privacy/AI-trust are converging. Expect backup inventories and multi-cloud data maps to unify.

  • F5 Enforcement Starts Now (Oct 22):
    CISA ED-26-01 deadlines in effect. Treat BIG-IP as Tier-0. Remove internet-exposed mgmt, apply Oct QSN patches, add TMSH/API anomaly detections.

  • Expert Insights from the Cloud Security Podcast Episode

    • Data > SIEM:
      Legacy SIEM + AI “bolt-ons” fail without unified logs + config + code + permissions context.

    • SaaS Blind Spots:
      GitHub, Snowflake, and Workspace lack native detections — require domain-specific engineering.

    • Cost Reality:
      Cloud telemetry (100× traditional logs) breaks ingestion-priced SIEMs; separate hot vs cold data for speed + economy.

    • SOC Evolution:
      Future teams = full-stack security engineers; AI handles triage, humans own investigation.

📰 THIS WEEK'S SECURITY HEADLINES

1. Veeam Acquires Securiti AI for $1.73B: Data Resilience Meets DSPM

Veeam announced plans to acquire DSPM and AI data governance vendor Securiti AI for $1.725 billion in a cash and stock deal expected to close in Q4 2025. The acquisition aims to unify backup and disaster recovery capabilities with data discovery, privacy management, DSPM, and AI trust functions.

Why this matters: This represents major consolidation at the intersection of data resilience and data security posture management. Enterprise security leaders should anticipate tighter integrations between backup inventories and multi-cloud data mapping across S3, Azure Blob, GCS, and SaaS platforms. For organizations running Veeam today, this creates an opportunity to converge policies around retention, privacy, data sovereignty, and response orchestration that spans both backup restoration and data-level containment. Key questions for your Veeam roadmap discussions: How will auto-classification work for backed-up data? What cross-cloud toxic data flow detection capabilities will emerge? How will immutable restore capabilities extend to AI model rollback scenarios?

This consolidation signals that data-centric security is moving beyond point solutions toward integrated platforms that understand both where data lives and how to recover it securely.

2. CISA Emergency Directive on F5 Breach Reaches Critical Enforcement Deadlines

Following F5's disclosure that a nation-state actor stole BIG-IP source code and internal vulnerability data, CISA issued Emergency Directive 26-01 ordering federal agencies to inventory and patch or replace affected F5 devices. Enforcement deadlines began October 22, with new reporting highlighting hundreds of thousands of internet-reachable BIG-IP instances at risk.

Why this matters: Source code and vulnerability intelligence theft fundamentally changes the threat landscape for these devices. This isn't just about known CVEs: attackers now have the blueprint to craft zero-day exploits against ADC and WAF gateways that front critical applications in both on-premises and hybrid cloud environments. For enterprise security architects, BIG-IP and related F5 infrastructure should be reclassified as Tier-0 assets requiring immediate action: restrict management plane access, enforce MFA, remove any internet-exposed management interfaces, apply F5's October Quarterly Security Notification packages, and implement CISA ED 26-01 hardening actions including asset discovery, configuration review, and log analysis. Additionally, security teams should add custom detections for anomalous TMSH and API activity, and hunt for any credential or API key leakage patterns referenced in the advisories going back at least 12 months.
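
To make the "custom detections for anomalous TMSH and API activity" recommendation concrete, here is a minimal Python sketch, assuming BIG-IP audit logs are already parsed into dicts by your log pipeline. The field names, baseline threshold, and sensitive-command prefixes are illustrative assumptions, not F5's actual log schema.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical parsed BIG-IP audit events from your log pipeline;
# field names are illustrative, not the actual F5 schema.
events = [
    {"ts": datetime(2025, 10, 22, 3, 1), "user": "svc-automation",
     "cmd": "tmsh list sys config"},
    {"ts": datetime(2025, 10, 22, 3, 2), "user": "svc-automation",
     "cmd": "tmsh modify auth user admin"},
]

BASELINE_CMDS_PER_HOUR = 20  # tune from your own 30-day baseline
SENSITIVE_PREFIXES = ("tmsh modify auth", "tmsh create auth",
                      "tmsh modify sys httpd")

def flag_anomalies(window_events, window=timedelta(hours=1)):
    """Flag users whose TMSH command rate exceeds baseline, plus any
    sensitive auth/management-plane change regardless of volume."""
    alerts = []
    per_user = Counter(e["user"] for e in window_events)
    for user, count in per_user.items():
        if count > BASELINE_CMDS_PER_HOUR:
            alerts.append(f"volume anomaly: {user} ran {count} commands in {window}")
    for e in window_events:
        if e["cmd"].startswith(SENSITIVE_PREFIXES):
            alerts.append(f"sensitive change by {e['user']}: {e['cmd']}")
    return alerts

for alert in flag_anomalies(events):
    print(alert)
```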

3. AWS US-East-1 Outage: A Multi-Region Resilience Wake-Up Call

A multi-hour incident in AWS's US-East-1 region on October 20 cascaded across load balancers and dependent services, disrupting thousands of major consumer and government applications. AWS attributed the trigger to internal infrastructure issues, and service has been restored.

Why this matters: While not a security breach, this outage serves as a critical resilience test for cloud architects. The incident exposed dependencies that many organizations didn't realize existed, particularly around authentication services, payment processing, and messaging infrastructure that assumed US-East-1 availability. Key architectural reviews for your team: revisit cell and region isolation strategies for critical paths, and validate control-plane dependencies including Route 53 health checks, ELB/ALB configurations, and IAM token lifetime assumptions. Most importantly, prove graceful degradation and blast-radius limits through game-day exercises, and ensure incident communication automations don't depend on the impaired region. This outage reinforces that multi-region architecture isn't just about disaster recovery; it's about maintaining operations when a primary region experiences prolonged disruption.
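
As one small starting point for that control-plane review, the boto3 sketch below inventories Route 53 health checks and their latest observations. It assumes credentials with route53:ListHealthChecks and route53:GetHealthCheckStatus permissions; treat it as a dependency-inventory aid ahead of a game day, not a failover test.

```python
import boto3
from botocore.exceptions import ClientError

# Inventory Route 53 health checks and their current observations,
# a first step in mapping control-plane dependencies.
r53 = boto3.client("route53")

paginator = r53.get_paginator("list_health_checks")
for page in paginator.paginate():
    for hc in page["HealthChecks"]:
        cfg = hc["HealthCheckConfig"]
        try:
            status = r53.get_health_check_status(HealthCheckId=hc["Id"])
            observations = [o["StatusReport"]["Status"]
                            for o in status["HealthCheckObservations"]]
        except ClientError:
            # e.g. calculated or CloudWatch-based checks have no direct status
            observations = ["(no direct status)"]
        print(hc["Id"], cfg.get("Type"),
              cfg.get("FullyQualifiedDomainName", "-"), observations[:2])
```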

4. MuddyWater APT Targets 100+ Government Entities with Phoenix Backdoor

New reporting details a broad phishing campaign from MuddyWater (Iranian state-sponsored group) using compromised mailboxes and VPN infrastructure to deliver macro-enabled payloads containing the Phoenix backdoor. The campaign targets government organizations across the Middle East and Africa.

Why this matters: Organizations that federate identities to Office 365 or Google Workspace and operate hybrid workloads in MEA regions should expect living-off-the-land persistence techniques and mailbox-to-mailbox lateral movement. The campaign demonstrates sophisticated use of compromised infrastructure to evade traditional perimeter defenses. Immediate hardening steps: Tighten conditional access policies, disable legacy authentication protocols, restrict macro execution via Attack Surface Reduction (ASR) rules, and enrich detection logic with VPN exit node indicators and suspicious OAuth token grant patterns. This campaign also highlights the importance of monitoring for anomalous authentication patterns and mailbox rules that could indicate compromise.
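
To illustrate that detection enrichment, here is a hedged Python sketch that filters normalized consent events for risky OAuth scopes granted from known VPN exit nodes or over legacy authentication. Every field name, scope, and indicator IP is a hypothetical placeholder to map onto your own Entra ID or Google Workspace audit schema.

```python
# Hypothetical normalized consent/sign-in events; adapt field names to
# your Entra ID or Google Workspace audit log schema.
VPN_EXIT_NODES = {"203.0.113.10", "198.51.100.7"}  # from threat intel feeds
RISKY_SCOPES = {"Mail.ReadWrite", "Mail.Send", "offline_access"}

def suspicious_oauth_grants(events):
    """Yield grants combining risky mailbox scopes with a known VPN exit
    node source or legacy authentication."""
    for e in events:
        scopes = set(e.get("granted_scopes", []))
        from_vpn = e.get("source_ip") in VPN_EXIT_NODES
        if (scopes & RISKY_SCOPES) and (from_vpn or e.get("legacy_auth")):
            yield e

events = [
    {"user": "a.user@example.gov", "source_ip": "203.0.113.10",
     "granted_scopes": ["Mail.ReadWrite", "offline_access"],
     "legacy_auth": False},
]
for hit in suspicious_oauth_grants(events):
    print("review consent:", hit["user"], hit["granted_scopes"])
```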

5. CISA Adds Five Actively Exploited Vulnerabilities to KEV Catalog

CISA updated the Known Exploited Vulnerabilities catalog on October 20, highlighting multiple bugs under active exploitation in the wild. Separate reporting confirms exploitation of a patched Windows SMB vulnerability.

Why this matters: KEV-listed vulnerabilities must be prioritized over CVSS-only scoring approaches; these represent confirmed threats that attackers are actively weaponizing. Security teams should integrate KEV feeds into vulnerability management programs to auto-create SLAs and prioritize remediation workflows. Where immediate patching isn't feasible, verify compensating controls are in place, particularly network segmentation, SMB signing enforcement, and EDR coverage on domain controllers and file servers that underpin cloud synchronization services. The Windows SMB exploitation is especially concerning given how critical these services are to hybrid cloud authentication and file sharing architectures.
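
One way to operationalize KEV-driven SLAs: CISA publishes the catalog as JSON, and each entry's dueDate (the federal remediation deadline) can seed your own SLA clocks. A minimal sketch, assuming you can export the CVE IDs present in your environment from your vulnerability scanner:

```python
import requests

KEV_URL = ("https://www.cisa.gov/sites/default/files/feeds/"
           "known_exploited_vulnerabilities.json")

# CVE IDs present in your environment, e.g. exported from your scanner
# (placeholder values).
inventory_cves = {"CVE-2025-00001", "CVE-2024-00002"}

kev = requests.get(KEV_URL, timeout=30).json()
for vuln in kev.get("vulnerabilities", []):
    if vuln["cveID"] in inventory_cves:
        # Reuse CISA's federal deadline as a starting point for your SLA.
        print(f"{vuln['cveID']} is KEV-listed, remediation due {vuln.get('dueDate')}")
```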

6. Healthcare Cyberattack Forces Massachusetts Hospitals to Divert Patients

A cyber incident on October 20 forced two Massachusetts hospitals to take IT and radiology systems offline and divert ambulance traffic. Early indicators point to ransomware as the attack vector.

Why this matters: This incident underscores that clinical systems, including PACS/RIS and imaging workstations, alongside affiliate and third-party network connections, represent critical weak points in healthcare infrastructure. For security leaders supporting healthcare workloads, this reinforces the need to isolate imaging networks from general IT infrastructure, require privileged access workstations for administrative functions, and rehearse downtime procedures including cloud-hosted EHR failover capabilities. The attack also highlights how ransomware continues to target healthcare organizations where operational disruption directly impacts patient care, making these organizations more likely to pay ransoms under duress.

7. AI Security Research: Prompt Injection Leads to Remote Code Execution

Trail of Bits published research demonstrating complete attack chains from prompt injection to remote code execution in LLM agent systems through tool use and environment bridges. The research coincided with commentary from OpenAI's CISO emphasizing risk concerns for newly launched agent features.

Why this matters: Organizations piloting agent-assisted SecOps or internal copilots with cloud tool access must treat these systems as untrusted execution environments, similar to web browsers. Security teams should: constrain agent tools with least-privilege access and time-boxed tokens, implement out-of-process sandboxes, add content provenance checks including URL/domain allowlists and HTML/script stripping, enforce model-side policy guards, log all tool invocations comprehensively, and bind guardrails to secrets managers with no inline keys. This research demonstrates that AI agents aren't just productivity tools; they're potential attack surfaces that require the same security rigor as any privileged system with access to production environments.
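
To illustrate two of those controls, a domain allowlist on an agent's fetch tool and comprehensive tool-invocation logging, here is a minimal Python sketch. The function shapes and allowlisted domains are hypothetical, not any particular agent framework's API.

```python
from urllib.parse import urlparse

ALLOWED_DOMAINS = {"internal-wiki.example.com", "api.example.com"}  # hypothetical

def guarded_fetch(url: str, fetch):
    """Wrap an agent's web-fetch tool with a domain allowlist so a
    prompt-injected instruction can't pull attacker-controlled content."""
    host = urlparse(url).hostname or ""
    if host not in ALLOWED_DOMAINS:
        raise PermissionError(f"domain not allowlisted: {host}")
    return fetch(url)

def audited_tool(name, fn, audit_log):
    """Record every tool invocation so investigators can replay exactly
    what the agent did and with which arguments."""
    def wrapper(*args, **kwargs):
        audit_log.append({"tool": name, "args": args, "kwargs": kwargs})
        return fn(*args, **kwargs)
    return wrapper

# Usage: wrap the raw tool before handing it to the agent runtime.
log = []
safe_fetch = audited_tool("fetch", lambda u: guarded_fetch(u, fetch=print), log)
```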

8. AWS Releases Amazon Corretto Quarterly Security Patches

AWS released Corretto (OpenJDK) quarterly security and critical updates across all LTS versions (25, 21, 17, 11, 8). Many Java-based services including Lambda functions and containerized microservices pin these runtimes as base dependencies.

Why this matters: Java supply-chain risk frequently enters through base container images and managed runtime environments. Security and DevOps teams should ensure CI/CD pipelines rebuild container images against the updated Corretto versions, rotate JDK installations in Elastic Beanstalk, EKS, and ECS environments, and redeploy Lambda functions using updated layers. This seemingly routine patching cycle represents a critical supply-chain security control especially given how widely Java runtimes are deployed across enterprise cloud workloads.

🎯 Cloud Security Topic of the Week:

Building AI-Native SOC Platforms: Why Traditional SIEM + AI Bolt-Ons Fall Short

“Can you simply use Claude Code or any advanced LLM to build an AI SOC?”
This question reflects broader confusion in the market about what it truly takes to operationalize AI in security operations. This week, we explore the architectural and operational realities of building AI-native SOC capabilities, with lessons that challenge conventional wisdom about "bolting AI onto existing tools."

Definitions and Core Concepts 📚

Before diving into our insights, let's clarify some key terms:

  • AI-Native SOC: A security operations center built from first principles with AI capabilities embedded at every layer, from data ingestion to detection, triage, investigation, and response, rather than AI features added to existing SIEM platforms.

  • DSPM (Data Security Posture Management): A category of security tools that discover, classify, and monitor sensitive data across multi-cloud and SaaS environments, providing visibility into data exposure, access patterns, and compliance risks.

  • SIEM (Security Information and Event Management): Traditional platforms that aggregate logs and events for correlation and analysis. Legacy SIEMs primarily process event data without broader context like configuration, code, or business logic.

  • Agentic SOC: Security operations platforms that use autonomous AI agents to perform multi-step security tasks including detection, triage, investigation, and response with minimal human intervention.

  • Detection Engineering: The practice of creating, testing, and maintaining threat detection logic including rules, behavioral analytics, and machine learning models to identify security incidents across infrastructure and applications.

This week's issue is sponsored by Exaforce

Exaforce transforms how security teams operate by delivering enterprise-grade SOC capabilities without increasing headcount.

Powered by agentic, multi-model AI and advanced data exploration capabilities, the Exaforce platform automates detection, triage, investigation, and response with expert analyst-level accuracy at machine speed.

Cut false positives by up to 80%, reduce MTTR by 70%, and lower costs while strengthening coverage across IaaS, SaaS, code, and identity.

Build or scale your SOC in hours, not months, and give your team the intelligence edge it deserves.

💡Our Insights from this Practitioner 🔍

The Vibe Coding Trap: Why AI SOC Requires More Than LLM Prompts (Full Episode here)

"Can I just use Claude Code to build an AI SOC?" This question surfaces constantly in enterprise security discussions, fueled by impressive demos of LLMs generating code and solving problems. Ariful Huq, who has spent the past two years building Exaforce's AI-native SOC platform, offers a nuanced perspective that every security leader evaluating AI capabilities should understand.

"I think this technology is incredibly powerful," Ariful begins. "It's given me superpowers as a product person. Just over the weekend, I was building session hijacking demos and exploring APIs things I wouldn't have done five years ago without a coding background." However, the critical distinction emerges when discussing production SOC platforms versus exploratory tooling: "If you're starting from scratch and thinking about building a data platform with everything on top of it from a management and resourcing perspective, that might not be worthwhile."

The core issue isn't whether LLMs can generate useful code (they can), but whether organizations understand the full stack required for production security operations at scale. Traditional approaches that "bolt on" AI capabilities to existing SIEM architectures encounter fundamental limitations rooted in data architecture, context, and operational maturity.

1️⃣ First Principles: Starting With Data Architecture, Not Detection Rules

Drawing an analogy to autonomous vehicles, Ariful explains why point solutions fall short: "If you look at Tesla and Waymo, they're not building bolt-on autonomous vehicles. They've built the infrastructure, the cameras, the data collection, the training. You can see they're making the most progress because they own the underlying platform."

This same principle applies to AI-native SOC platforms. "I think the bolt-on approach is tough. If you're really looking for good outcomes, you really need to think about an approach starting from first principles with the data," Ariful emphasizes. "It's well more than just logs and event data, it's config, code context, bringing it all together. I think it's going well beyond what SIEMs are capable of."

The data challenge manifests in several dimensions that enterprise SOC teams face today:

Volume Explosion from Cloud Infrastructure: At Exaforce, analyzing their own security telemetry revealed that IaaS logs, specifically from public cloud platforms, generate 100 times the volume of all other data sources combined. This astronomical scale puts tremendous pressure on architectural choices made for traditional on-premises environments. Legacy SIEM architectures that charge based on data ingestion become prohibitively expensive when cloud workloads generate petabytes of security-relevant telemetry.

The Real-Time Processing Tax: Security operations fundamentally differs from other data analytics use cases because SOC teams require real-time threat detection and response. Ariful highlights a critical insight about data lake economics: "If you're building a SIEM on top of Snowflake, every time you run a query or detection, you're getting charged. The real-time data processing is the critical thing here."

This continuous processing requirement drove Exaforce to architect a hybrid approach: hot data in memory for real-time querying, combined with cold storage leveraging Snowflake and Apache Iceberg on S3 for historical analysis. "We had to decouple ourselves to leverage technologies like Snowflake for what they're good at, while not getting charged for every real-time detection query," Ariful explains.
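
A toy illustration of that decoupling (not Exaforce's actual implementation): route each query by its time range so real-time detections hit the in-memory hot store, while historical investigations go to Iceberg/Snowflake cold storage where per-query warehouse charges only apply to the long tail.

```python
from datetime import datetime, timedelta, timezone

HOT_WINDOW = timedelta(days=7)  # assumption: recent data is held in memory

def route_query(range_start: datetime, hot_store, cold_store):
    """Send queries that only touch recent data to the hot store;
    anything older falls through to cold storage."""
    cutoff = datetime.now(timezone.utc) - HOT_WINDOW
    return hot_store if range_start >= cutoff else cold_store

# Example: a real-time detection over the last 15 minutes stays hot,
# while a 90-day investigation query goes to the cold tier.
recent = datetime.now(timezone.utc) - timedelta(minutes=15)
backend = route_query(recent, hot_store="in-memory", cold_store="iceberg-on-s3")
print(backend)  # -> in-memory
```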

One customer example crystallizes the problem: they stored all cloud logs in S3 and used Athena for queries, resulting in query times of 50 minutes to an hour. The cost savings from avoiding SIEM ingestion fees were completely negated by operational paralysis. Another customer tried sending all cloud logs to their existing SIEM, watched costs skyrocket, then pulled back and sought augmentation technologies to handle cloud telemetry separately.

2️⃣ Context Is Everything: Why Log Data Alone Fails AI Systems

The most common mistake organizations make when attempting to build AI SOC capabilities centers on data completeness. Traditional SIEMs excel at processing logs and events, but AI systems require substantially more context to produce reliable, actionable results.

Ariful illustrates this with a concrete example: "Let's say you're trying to do insider threat detection. One behavior is somebody without edit permissions copying files and making them public. To build that detection, you need event data about the action, config information on the resource, and permission data to determine if they had edit rights."

This pattern repeats across cloud security use cases. Investigating anomalous S3 bucket access requires understanding: Who performed the action? What IAM role or user identity was assumed? Who provisioned the S3 bucket originally? What sensitivity classification applies to the objects? Which specific keys were accessed?
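
As a sketch of what assembling that context might look like before any LLM sees the alert, the function below bundles event, identity, config, and classification data for an anomalous S3 access. The config_api, iam_api, and dspm_api wrappers are hypothetical stand-ins for your inventory, IAM, and data-classification systems.

```python
def build_triage_context(event, config_api, iam_api, dspm_api):
    """Gather the log + config + permission context an LLM needs to
    triage an anomalous S3 access, rather than handing it a raw event."""
    bucket = event["bucket"]
    principal = event["principal_arn"]
    return {
        "event": event,                                   # who did what, when
        "assumed_identity": iam_api.resolve_identity(principal),
        "bucket_provisioner": config_api.provisioner(bucket),
        "data_classification": dspm_api.classification(bucket),
        "objects_touched": event.get("object_keys", []),
        "principal_can_edit": iam_api.has_permission(
            principal, "s3:PutObject", bucket),
    }
```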

"You have to give LLMs the right context. You can remove the guesswork," Ariful explains, drawing a human analogy. "If you ask me a vague question, my thought process could go many different directions. But if you remove the guesswork and give me preciseness plus reasoning freedom, you'll get consistent answers even between different people. That's exactly what we're trying to do with LLMs to remove guesswork by providing context from logs, config, code, and business understanding."

This insight proved pivotal in Exaforce's development journey. In their first year, they started with third-party detections and leveraged LLMs for triage. "The results were unpredictable, not precise, not what we expected," Ariful recalls. "It really boiled down to the data. We needed config context, permission information, code context for GitHub, not just event logs."

3️⃣ Detection Engineering at Scale: The SaaS Blind Spot

Modern enterprises operate across dozens of SaaS platforms: GitHub for code, Snowflake for analytics, Google Workspace for collaboration, OpenAI for AI capabilities. A critical gap emerges: none of these platforms provide native threat detection capabilities.

"GitHub has no native threat detection," Ariful points out. "If you have a personal access token or SSH key compromised, you need your own detections to figure that out." This reality creates a detection coverage gap that traditional endpoint and email security tools don't address those problems are largely solved by CrowdStrike, Sentinel One, and mature email security providers.

Exaforce focused detection engineering efforts specifically on SaaS platforms, but the work required goes far beyond writing simple correlation rules. "You need domain-specific understanding of data. You almost need to build domain-specific knowledge by understanding events, resources, and platform intricacies for every single data source. GWS has its own intricacies. GitHub has its own intricacies."

Detection engineering at this level requires understanding platform-specific concepts like GitHub personal access tokens, Google Workspace OAuth scopes, and Snowflake account privileges. Teams must then build statistical models on top of this domain knowledge to identify anomalous patterns. "You have to figure out what is potentially anomalous, build statistical models, and determine thresholds," Ariful explains. "It's hard to just build rule-based detections where step one happens, step two happens, fire alert. It's much more complex. You need anomaly detection capabilities that many SIEMs simply don't provide."
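
For a flavor of what the simplest statistical model here can look like, the sketch below applies a z-score baseline to one GitHub personal access token's daily API call volume. Real detections need per-platform tuning and much richer features, so treat this as an assumption-laden sketch, not a production detection.

```python
import statistics

def pat_volume_anomaly(daily_counts, today_count, z_threshold=3.0):
    """Flag a personal access token whose API call volume today deviates
    sharply from its own historical baseline."""
    mean = statistics.fmean(daily_counts)
    stdev = statistics.pstdev(daily_counts) or 1.0  # guard flat baselines
    z_score = (today_count - mean) / stdev
    return z_score >= z_threshold, z_score

history = [42, 38, 55, 40, 47, 51, 44]  # calls/day for one token (illustrative)
anomalous, score = pat_volume_anomaly(history, today_count=420)
print(anomalous, round(score, 1))  # -> True, well above threshold
```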

This detection engineering challenge explains why organizations with strong detection engineering teams still struggle with SaaS coverage. Even with skilled engineers, building and maintaining domain-specific detection logic across a dozen SaaS platforms becomes a Sisyphean task without purpose-built tooling and data integration.

4️⃣ The Trust and Transparency Problem: How Do You Know AI Got It Right?

Perhaps the most challenging question for AI SOC platforms centers on trust. In security operations, false positives erode analyst confidence while false negatives create blind spots. When AI makes triage decisions, security leaders rightfully ask: "How do I know it got this right?"

Ariful's response draws again on the autonomous vehicle analogy: "There's no magic pixie dust for trust. For some time, you didn't see Waymo vehicles drive autonomously; you saw them with somebody inside. That's the phase we're in with SOC operations where you still need people driving."

Exaforce approached this through three mechanisms:

Transparency Through Complete Context: "You have to provide every aspect of the decision, the data the agent relied on, all relevant questions it answered to reach conclusions. You can't ask people to context switch to other data platforms to validate decisions." The system must surface all reasoning paths and supporting evidence inline.

Human-in-the-Loop Learning: Exaforce built a dedicated team to review AI triage decisions as humans, labeling true positives versus false positives. This evolved into a Managed Detection and Response (MDR) service that serves dual purposes: providing customers with better outcomes through human oversight while simultaneously improving the product through labeled training data. "We invest in MDR not just as a service but as a mechanism to build better technology," Ariful explains.

Continuous Learning from Historical Context: The system maintains historical data about past detections and their classifications. When a specific alert type has been marked as a false positive with documented reasoning, that context informs future triage decisions. Organizations can also provide business context, such as knowledge bases, operational playbooks, and environment-specific details, that the system incorporates into every analysis.

"There's no easy answer," Ariful admits. "It's a combination of transparency, human validation loops, and continuous learning from historical classifications that helps us get better results over time."

5️⃣ Rethinking SOC Teams: The Rise of Full-Stack Security Engineers

If AI can handle alert triage, investigation assistance, and even multi-step response workflows, what does this mean for SOC staffing? Ariful sees a fundamental shift emerging that elevates rather than replaces human talent.

"I really think the future is where security engineers become full stack. You're not an analyst," Ariful states. "Your best analyst doesn't want to be the guy that's just doing investigations. Your best analyst is probably someone incredibly valuable to the organization in many other ways."

This perspective aligns with what Exaforce observes across customer engagements. Growth-stage companies rethinking their SOC approach are hiring differently: "They're not hiring analysts. They're hiring individuals that can perform all these tasks: security engineering, some detection work, incident response because they're going to leverage a combination of human talent and AI."

The traditional SOC staffing model (buying a SIEM, hiring detection engineers, hiring L1/L2 analysts) requires massive upfront capital and headcount investment, with value realization potentially taking more than a year. In contrast, an AI-native approach allows a small team of full-stack security engineers to leverage technology for the mundane work while focusing on high-value threat hunting, complex investigations, and security engineering projects.

"If you're in the SOC, leverage AI to be a builder and a better defender not to do mundane tasks," Ariful emphasizes. This shift particularly benefits early-career security professionals. Those with 2-3 years of experience who know enough to ask good questions but want to advance to senior roles can use AI capabilities to punch above their weight class. They can engage with complex investigations and advanced analysis previously reserved for principal-level engineers, while AI handles the repetitive triage work that has historically created burnout and high turnover among SOC analysts.

6️⃣ Beyond Bedrock: The Engineering Reality of Production AI Systems

For organizations building on AWS, Amazon Bedrock provides an obvious starting point for LLM capabilities. Exaforce leverages Bedrock extensively and even won AWS's Partner Startup of the Year award. However, Ariful cautions that Bedrock alone doesn't constitute a production AI agent system.

"Bedrock is incredibly beneficial it's a native AWS service, so all your data stays within AWS infrastructure, which helps with data residency and sovereignty questions," Ariful notes. "But it's not just about Bedrock. Building a robust agent system goes well beyond what Bedrock offers."

Production requirements that teams must build themselves include the following (see the sketch after this list):

  • Retry mechanisms: When agents reach out to third-party systems or individuals, how do you handle failures?

  • Asynchronous processing: If an agent performs multiple tasks, should they execute sequentially or in parallel?

  • Upgrade handling: When upgrading software, how do you ensure in-flight agent tasks complete or restart appropriately?

  • State management: How do you track agent progress across distributed systems?
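
As a rough illustration of the retry, asynchronous-processing, and state-management requirements, here is an asyncio sketch combining exponential-backoff retries, parallel sub-tasks, and a simple in-memory state map. A production system would persist that state durably and handle in-flight upgrades, which is exactly the infrastructure Ariful describes having to build.

```python
import asyncio
import random

async def with_retries(make_call, attempts=4, base_delay=1.0):
    """Retry a flaky call to a third-party system with exponential
    backoff and jitter."""
    for attempt in range(attempts):
        try:
            return await make_call()
        except Exception:
            if attempt == attempts - 1:
                raise
            await asyncio.sleep(base_delay * 2 ** attempt + random.random())

async def run_subtasks(tasks, state):
    """Run independent agent sub-tasks in parallel, tracking state so a
    crash or upgrade can identify unfinished work to resume."""
    async def tracked(name, make_call):
        state[name] = "running"
        result = await with_retries(make_call)
        state[name] = "done"
        return result
    return await asyncio.gather(
        *(tracked(name, call) for name, call in tasks.items()))
```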

Exaforce's engineering team evaluated open-source frameworks like LangChain but ultimately built custom infrastructure because production requirements exceeded what generic frameworks provided. "We obviously started with Bedrock quickly; that's the benefit of cloud," Ariful explains. "But then there's all this other infrastructure you have to build for enterprise reliability."

This reality check matters for security leaders evaluating build-versus-buy decisions. The impressive demos of LLMs solving problems don't capture the operational complexity of running agent systems at scale in production security environments with SLAs and compliance requirements.

A question for you (reply to this email):

⚙️ Is your SOC team spending more time managing queries or hunting threats?

Next week, we'll explore another critical aspect of cloud security. Stay tuned!

📬 Want weekly expert takes on AI & Cloud Security? [Subscribe here]

We would love to hear from you📢 for a feature or topic request or if you would like to sponsor an edition of Cloud Security Newsletter.

Thank you for continuing to subscribe, and welcome to the new members of this newsletter community💙

Peace!

Was this forwarded to you? You can sign up here to join our growing readership.

Want to sponsor the next newsletter edition? Let's make it happen.

Have you joined our FREE Monthly Cloud Security Bootcamp yet?

Check out our sister podcast, AI Security Podcast.