🚨 ServiceNow Acquires Veza for $1B* as Identity Becomes Critical Attack Vector: Lessons from Building Cloud-Native Data Lakes at Scale
This week covers ServiceNow's strategic $1B* acquisition of identity security firm Veza, the Oracle E-Business Suite zero-day campaign affecting 100+ organizations, and a demonstration of Claude AI plug-ins deploying ransomware. Security expert Cliff Crosford shares hard-won lessons from building security data lakes at scale, addressing SIEM cost challenges, and implementing AI-driven detection pipelines for enterprise cloud security teams.
Hello from the Cloud-verse!
This week's Cloud Security Newsletter topic: From SIEM to Data Lake: Why Security Teams Are Making the Shift (And What It Actually Takes) (continue reading)
In case this is your first Cloud Security Newsletter: you are in good company!
You are reading this issue along with friends and colleagues from companies like Netflix, Citi, JP Morgan, LinkedIn, Reddit, GitHub, GitLab, CapitalOne, Robinhood, HSBC, British Airways, Airbnb, Block, Booking Inc & more, who subscribe to this newsletter and, like you, want to learn what's new in cloud security each week from their industry peers, many of whom also listen to Cloud Security Podcast & AI Security Podcast every week.
Welcome to this week’s Cloud Security Newsletter
As identity security takes center stage with ServiceNow's billion-dollar acquisition of Veza and AI security vulnerabilities proliferate across enterprise environments, security teams face a critical inflection point: how do we scale visibility without breaking the bank?
This week, we're joined by Cliff Crosford, co-founder of Scanner.dev and veteran of multiple security startups including an acquisition into Cisco. Cliff shares hard-won lessons from building security data lakes at massive scale including what works, what fails spectacularly, and why most teams underestimate the engineering lift required to make data lakes actually usable for security operations. [Listen to the episode]
📰 TL;DR for Busy Readers
ServiceNow's $1B* Veza acquisition signals identity + ITSM consolidation. Identity is consolidating fast, and the Veza move is a clear play to own "who has access to what" across cloud and SaaS; revisit your IAM/CIEM roadmap now.
Claude AI "Skills" plug-ins shown deploying ransomware. AI plug-ins are the new PowerShell: Claude Skills and shadow AI in the browser demand Tier 0 governance and an allowlist model, not "free install for everyone." Treat AI tools as Tier 0 code execution environments.
Oracle EBS zero-day hits 100+ orgs, including Penn and Phoenix universities. Confirm patch status and SSO integration points immediately.
AWS and Google announce a joint multicloud networking service. Multicloud networking is getting easier and riskier: the joint fabric is great for latency and terrible for flat, poorly governed architectures.
Data lakes promise SIEM cost savings but require a serious engineering budget for schema maintenance, not just storage.
AI can accelerate detection tuning by ~80%, but humans remain essential: implement human-in-the-loop review for all AI-generated detections.
📰 THIS WEEK'S TOP 5 SECURITY HEADLINES
1. ServiceNow Moves to Acquire Veza in ~$1B Identity Security Deal
What Happened
ServiceNow announced a definitive agreement to acquire identity security company Veza in a deal reportedly valued at around $1 billion, according to SecurityWeek's M&A tracker. Veza specializes in authorization and access intelligence across modern infrastructure and SaaS, mapping "who has access to what" across cloud providers, data platforms, and business applications.
Why It Matters
Identity is the new control plane in cloud-native environments. Veza's entitlement graph across AWS, Azure, GCP, SaaS, and data stores represents exactly the visibility layer many organizations are attempting to build themselves. This acquisition signals further consolidation of identity, ITSM, and security operations into single platforms, a trend that creates both opportunities and risks.
As Cliff Crosford notes from his experience building security infrastructure at scale: "Log volumes just get massive and then it becomes impossible to keep all of the logs that you want in your SIEM. Traditional SIEMs were wonderful in the era when you had maybe individual gigabytes or tens of gigabytes of logs per day." ServiceNow's move to acquire Veza positions them to become the central control plane not just for workflow, but for identity telemetry at enterprise scale.
For large enterprises, the strategic implications are significant: tighter vendor integration enables faster response automation, but it also creates a higher blast radius when identity metadata and agentic controls live inside a single vendor platform. Organizations already using ServiceNow for ITSM should immediately map where identity artifacts flow today and prepare for integration changes.
Actionable Steps:
Inventory your current IAM/CIEM/DSPM/IGA stack and note overlaps with Veza's capabilities (a starter sketch follows this list)
If you're a ServiceNow customer, ask reps about roadmap timelines, data residency, and API access
Track this as a signal that "identity + operations + security" convergence is accelerating
Update procurement risk registers to evaluate lock-in and data residency implications
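For the inventory step above, here is a minimal sketch of a starting point: enumerate IAM users and their attached managed policies in a single AWS account with boto3 (assuming credentials are already configured). Extending it to roles, groups, inline policies, and other clouds is where the real Veza-style work begins.

```python
# Minimal sketch: list IAM users and their attached managed policies in
# one AWS account, as a seed for an entitlement inventory.
# Assumes boto3 is installed and AWS credentials are configured.
import boto3

def inventory_iam_entitlements() -> dict:
    iam = boto3.client("iam")
    inventory = {}
    for page in iam.get_paginator("list_users").paginate():
        for user in page["Users"]:
            name = user["UserName"]
            attached = iam.list_attached_user_policies(UserName=name)
            inventory[name] = [p["PolicyArn"] for p in attached["AttachedPolicies"]]
    return inventory

if __name__ == "__main__":
    for user, policies in sorted(inventory_iam_entitlements().items()):
        print(f"{user}: {policies}")
```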
Source: SecurityWeek M&A Tracker
2. Oracle E-Business Suite Zero-Day Campaign Widens – Penn & Phoenix Universities Confirm Breaches
What Happened
The University of Pennsylvania and University of Phoenix disclosed they're victims of the ongoing Oracle E-Business Suite (EBS) hacking campaign, which has already impacted more than 100 organizations. Attackers compromised EBS instances used for supplier payments, general ledger, and other core business processes, exposing PII, dates of birth, Social Security numbers, and banking details. The campaign is linked to Cl0p ransomware activity, with FIN11 suspected behind the intrusion chain, and appears to leverage zero-day vulnerabilities in Oracle EBS that remain publicly undisclosed.
Why It Matters
ERP systems like Oracle EBS sit at the heart of finance, procurement, payroll, and student information, and are increasingly hosted in cloud and hybrid environments. A compromise here is not just a "data breach"; it's a business operations incident that can halt core business functions.
The campaign highlights the opacity of third-party risk when organizations consume Oracle as a managed service or via integrators. Limited visibility into vendor-hosted ERP can leave security teams blindsided when zero-days are exploited at scale. Once attackers gain access to EBS, they're one hop away from SSO integrations, data lakes, and downstream SaaS applications that consume ERP data, multiplying the blast radius if identities or integration credentials are reused.
Actionable Steps:
Immediately confirm whether you run Oracle EBS (on-prem or hosted) or if vendors do on your behalf
Demand clear answers on patch level and mitigations applied against the EBS campaign
Review how ERP authentication integrates with IdP, VPNs, and cloud platforms
Add ERP systems to attack surface management and tabletop exercises
Prepare procedures to isolate ERP environments and rotate integration credentials quickly (see the sketch after this list)
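As a sketch of what "rotate integration credentials quickly" can look like when those credentials live in AWS Secrets Manager: the secret names below are hypothetical placeholders, and rotate_secret only works if a rotation Lambda is already attached to each secret.

```python
# Minimal sketch: kick off immediate rotation of ERP integration secrets
# during incident response. Secret names are hypothetical placeholders,
# and each secret must already have a rotation Lambda configured.
import boto3

ERP_INTEGRATION_SECRETS = [
    "erp/ebs/sso-service-account",    # hypothetical
    "erp/ebs/supplier-payments-api",  # hypothetical
]

def rotate_erp_secrets() -> None:
    sm = boto3.client("secretsmanager")
    for secret_id in ERP_INTEGRATION_SECRETS:
        resp = sm.rotate_secret(SecretId=secret_id)
        print(f"Rotation started for {secret_id}: version {resp['VersionId']}")

if __name__ == "__main__":
    rotate_erp_secrets()
```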
3. Claude "Skills" Plug-ins Shown Able to Deploy Ransomware
What Happened
A Cato Networks researcher demonstrated that an Anthropic Claude "Skill", a plug-in used by Claude Code to automate tasks, can be modified to deploy MedusaLocker ransomware without being flagged by the model. By inserting a seemingly benign function into Anthropic's open-source "GIF Creator" Skill that downloads and executes external code, the researcher showed that Claude reviews only the visible Skill code, not remote payloads fetched at runtime. Anyone can download, tweak, and re-upload Skills in similar fashion, and Anthropic's current stance is that users are responsible for trusting Skills.
In parallel, security researchers are warning about "shadow AI in the browser": unmanaged AI extensions and agentic browsers that can read data across SaaS tabs and silently exfiltrate it, often outside CASB/DLP visibility.
Why It Matters
AI tools are becoming the new "PowerShell" of enterprise environments. Skills, plug-ins, and AI-controlled agents are code execution environments with access to source code, secrets, and internal SaaS, not harmless productivity helpers. The Claude demonstration shows how little effort is required to turn them into delivery vectors for ransomware and other malware.
Traditional code review, software composition analysis, and CASB controls often ignore AI plug-in ecosystems and browser agents, even though they can read session cookies, internal dashboards, and cloud consoles in the browser context. Auto-updating Skills and extensions create an AI supply chain where a poisoned update or compromised extension publisher can instantly impact thousands of developers and analysts.
This connects directly to Cliff's point about data engineering complexity: "Every log source is quite a bit of work, and you will be on this endless journey of maintaining a data lake forever." Just as security teams must continuously maintain log pipelines, they now must maintain AI agent governance, treating these tools as critical infrastructure requiring the same rigor as production code.
Actionable Steps:
Treat AI plug-ins/Skills/agents as Tier 0 code requiring allowlisting and code review (an allowlist sketch follows this list)
Require provenance checks for internal or forked Skills, just like internal libraries
Update acceptable use and AI policies to cover approved tools and prohibited data types
Implement restrictions on installing AI extensions/agentic browsers on corporate endpoints
Ask EDR and browser security vendors how they detect AI-driven automation and token exfiltration
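As a sketch of what an allowlist model for Skills could look like: hash the reviewed Skill bundle and refuse anything that doesn't match. Names, paths, and hashes below are placeholders. Note the limitation the Cato research exposed: hashing the visible code says nothing about payloads fetched at runtime, so egress controls are still needed on top.

```python
# Minimal sketch: allowlist AI Skills by content hash before install.
# Skill names, paths, and hashes are illustrative placeholders.
# Caveat: this only vouches for the visible code; it cannot cover
# payloads a Skill downloads at runtime (the exact gap in the Cato demo).
import hashlib
import pathlib

APPROVED_SKILL_HASHES = {
    "gif-creator": "0" * 64,  # sha256 of the reviewed bundle (placeholder)
}

def skill_digest(skill_dir: str) -> str:
    h = hashlib.sha256()
    for path in sorted(pathlib.Path(skill_dir).rglob("*")):
        if path.is_file():
            h.update(path.read_bytes())
    return h.hexdigest()

def is_approved(skill_name: str, skill_dir: str) -> bool:
    return APPROVED_SKILL_HASHES.get(skill_name) == skill_digest(skill_dir)

if __name__ == "__main__":
    print(is_approved("gif-creator", "./skills/gif-creator"))
```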
4. AWS and Google Announce Joint Multicloud Networking Service
What Happened
AWS and Google Cloud jointly launched a new multicloud networking service combining AWS Interconnect–multicloud with Google Cloud's Cross-Cloud Interconnect, allowing customers to establish private, high-speed links between AWS and GCP environments in minutes instead of weeks. The providers also introduced an open specification for network interoperability, with Salesforce named as a day-one user.
Why It Matters
Multicloud private connectivity is becoming dramatically easier, which is great for latency and reliability, but it also increases the need for tight segmentation, routing governance, and inspection between clouds. Many enterprises currently rely on DIY IPsec or third-party SD-WAN for multicloud connections. This offering will tempt teams to move to provider-managed connectivity, which can be beneficial, but security controls must move with you (firewalls, IDS/IPS, service mesh, policy as code).
Once AWS and GCP are connected via one high-speed "trusted" fabric, misconfigurations in one cloud (e.g., flat VPCs, overly broad peering) can more easily propagate risk to the other. As organizations build cross-cloud data lakes and security operations, this interconnect becomes a critical attack surface requiring dedicated monitoring and segmentation.
Actionable Steps:
Request a threat model of current and planned multicloud connectivity from your cloud networking team
Define security guardrails: mandatory use of segmented VRFs/VPCs/projects and clear patterns for east-west inspection
Update cloud architecture standards to ensure cross-cloud links are visible to SecOps
Integrate new cross-cloud resources into logging, NDR, and incident response workflows
Review network segmentation and egress policies to ensure flow logs cover cross-cloud transport (a coverage-check sketch follows this list)
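On the last point, a minimal sketch of a coverage check on the AWS side: list VPCs in a region that have no flow logs attached, so a new cross-cloud link doesn't terminate in an unmonitored network. The region is illustrative; a GCP equivalent would inspect subnet flow-log settings.

```python
# Minimal sketch: find VPCs with no flow logs in one region, so
# cross-cloud attachments don't land in unmonitored networks.
# Assumes boto3 and AWS credentials are configured; region is illustrative.
import boto3

def vpcs_without_flow_logs(region: str = "us-east-1") -> list:
    ec2 = boto3.client("ec2", region_name=region)
    vpc_ids = {v["VpcId"] for v in ec2.describe_vpcs()["Vpcs"]}
    covered = {fl["ResourceId"] for fl in ec2.describe_flow_logs()["FlowLogs"]}
    return sorted(vpc_ids - covered)

if __name__ == "__main__":
    print("VPCs missing flow logs:", vpcs_without_flow_logs())
```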
🎯 Cloud Security Topic of the Week:
From SIEM to Data Lake: Why Security Teams Are Making the Shift (And What It Actually Takes)
The traditional SIEM model is breaking under the weight of modern log volumes. This week's conversation with Cliff Crosford reveals a painful truth that many security leaders are discovering: when your log volume reaches multiple terabytes per day, the economics of traditional SIEMs become untenable, forcing impossible choices about which logs to drop and which blind spots to accept.
Featured Experts This Week 🎤
Cliff Crosford - Co-founder, Scanner.dev
Ashish Rajan - CISO | Co-Host of AI Security Podcast, Host of Cloud Security Podcast
Definitions and Core Concepts 📚
Before diving into our insights, let's clarify some key terms:
Data Lake vs. Traditional SIEM: A data lake is an architectural approach that stores massive volumes of raw data in object storage (like S3) at significantly lower cost than traditional SIEMs. Unlike SIEMs that require structured ingestion and charge by volume, data lakes can store petabytes of logs indefinitely. However, they require significant data engineering to make logs searchable and useful.
OCSF (Open Cybersecurity Schema Framework): A vendor-agnostic schema that provides a common structure for security log data across different sources. While useful for standardization, it requires significant transformation work to fit diverse log sources into its strict schema requirements.
Schema Normalization: The process of transforming logs from different sources into a consistent structure with standardized field names. This enables correlation across log sources but requires ongoing maintenance as applications evolve and schemas change.
Security Data Lake: A log and event repository built on cloud object storage (e.g., S3, GCS, ADLS) plus engines like Presto/Trino, Spark, or specialized lake query engines. Optimized for cheap, long-term storage and large-scale analytics, not for the interactive alerting workflows SIEMs excel at.
Traditional SIEM: Platforms like Splunk, Elastic, QRadar, etc. Great for parsing, normalizing, and searching logs, up to a point. Licensing is usually volume-based, which becomes prohibitive beyond hundreds of GB/day, pushing teams to sample or drop logs.
Normalization vs. "Messy Data": Normalization means forcing all logs into a consistent schema. In practice, custom apps and constantly changing sources make perfect normalization a full-time job. Cliff argues for "best-effort normalization" plus tools that can search nested JSON and free text without requiring strict tables.
Detection Engineering: The practice of designing, tuning, and maintaining detection rules and pipelines. In a data lake world, this includes SQL rules, full-text searches, anomaly jobs, and AI-assisted investigations, all closely tied to schema evolution and data quality.
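To make "best-effort normalization" concrete, here is a small sketch: map a handful of field aliases onto canonical names and keep the original record intact, rather than forcing everything into a strict table. The alias map is illustrative, not a standard.

```python
# Minimal sketch of best-effort normalization: resolve a few canonical
# fields via known aliases (dotted paths handle nesting) and keep the
# raw record alongside. The alias map is illustrative.
FIELD_ALIASES = {
    "src_ip": ["src_ip", "sourceIPAddress", "client_ip", "srcaddr"],
    "user": ["user", "userName", "userIdentity.userName", "actor"],
}

def get_nested(record: dict, dotted_key: str):
    value = record
    for part in dotted_key.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value

def normalize(record: dict) -> dict:
    normalized = {"raw": record}  # never throw away the messy original
    for canonical, aliases in FIELD_ALIASES.items():
        for alias in aliases:
            value = get_nested(record, alias)
            if value is not None:
                normalized[canonical] = value
                break
    return normalized

# A CloudTrail-style event maps cleanly; unrecognized fields stay in "raw".
print(normalize({"sourceIPAddress": "203.0.113.7", "eventName": "ConsoleLogin"}))
```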
This week's issue is sponsored by Drata
Security teams shouldn’t be buried in manual evidence collection.
Drata automates compliance end-to-end while providing unified visibility across cloud workloads, identities, and configurations.
Teams use Drata to cut audit prep from weeks to hours, accelerate security reviews, and reinforce DevSecOps pipelines with real-time controls monitoring.
If you're scaling cloud infrastructure and need a smarter path to continuous compliance, Drata is built for you.
💡 Our Insights from this Practitioner 🔍
SIEM vs. Data Lake: Why We Ditched Traditional Logging? (Full Episode here)
The Economic Reality Forcing Change
Cliff's experience building security infrastructure at a prior startup (later acquired by Cisco) illustrates the breaking point many organizations face. "To increase our license, it would've been more expensive than the entire budget for the engineering team," Cliff explains. When faced with exploding log volumes at his previous company, his team redirected 90% of log data to S3 buckets, a seemingly logical cost-saving measure.
But the reality proved more complex. "It became a bit of a black hole where like you couldn't really search through very much data once the data set became large. Querying that data lake at S3 became more and more painful over time."
At modern scale, volume-based SIEM licensing becomes economically impossible. Data lakes promise cheaper, near-infinite retention for multi-TB/day security data and the ability to keep all security-relevant logs instead of sampling or dropping them. They also provide a better substrate for AI-driven detection and investigation. But today, they come with hard trade-offs around engineering effort, usability, and search performance.
What looked like a financial win turned into an operational nightmare. The logs existed, but they were essentially unusable for actual security investigations. This experience highlights a fundamental misunderstanding about data lakes: cheap storage is only valuable if you can actually query the data when you need it. For security teams, that means during active incidents when every minute counts.
When Cliff's team tried to use Athena for queries, they hit the wall that many organizations eventually face: "The queries would take three hours to run and might cost a few hundred dollars." This isn't a configuration problem; it's an architectural reality of how SQL-based data lake engines work.
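The arithmetic behind that pain is easy to reproduce. A back-of-envelope sketch, assuming Athena-style pricing of roughly $5 per TB scanned (verify current pricing) and illustrative volumes of 10 TB/day kept for 90 days:

```python
# Back-of-envelope sketch: scan-based pricing punishes unpartitioned data.
# Assumes ~$5 per TB scanned (Athena-style; verify current pricing) and
# illustrative volumes of 10 TB/day retained for 90 days.
PRICE_PER_TB_USD = 5.0
DAILY_VOLUME_TB = 10
RETENTION_DAYS = 90

full_scan_tb = DAILY_VOLUME_TB * RETENTION_DAYS  # 900 TB per query
print(f"Unpartitioned full scan: ~${full_scan_tb * PRICE_PER_TB_USD:,.0f}")

# Partitioning by date confines a one-day investigation to one day's data:
partitioned_tb = DAILY_VOLUME_TB
print(f"Date-partitioned scan:   ~${partitioned_tb * PRICE_PER_TB_USD:,.0f}")
```

Partitioning helps the cost side, but it does nothing for substring search over messy text, which is the separate problem discussed below.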
The Engineering Reality Most Teams Underestimate
The fundamental issue? "Every log source is quite a bit of work, and you will be on this endless journey of maintaining a data lake forever," Cliff warns. Security teams often think of data lake migration as a one-time project, but his experience reveals it as an ongoing engineering commitment. Each new log source requires custom work to fetch, transform, and fit into your schema. And when applications update and schemas change, which happens constantly, your detection rules break.
One particularly painful reality: "Basically every week, at least one of them was misbehaving because the schema changed a little bit. New fields showed up that were important or got renamed and then suddenly it stopped getting inserted." For a team managing 40-50 log sources, this becomes a weekly firefighting exercise.
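One pragmatic mitigation is to detect drift before inserts silently fail. A minimal sketch, with illustrative field names and stubbed alerting:

```python
# Minimal sketch: compare fields observed in a fresh sample of a log
# source against a stored baseline and flag drift. Field names are
# illustrative; baseline storage and paging are left as stubs.
import json

def observed_fields(sample_lines: list) -> set:
    fields = set()
    for line in sample_lines:
        fields.update(json.loads(line).keys())
    return fields

def detect_drift(source: str, sample_lines: list, baseline: set) -> set:
    seen = observed_fields(sample_lines)
    added, removed = seen - baseline, baseline - seen
    if added or removed:
        # In practice: alert the pipeline owner before detections go blind.
        print(f"[{source}] schema drift: +{sorted(added)} -{sorted(removed)}")
    return seen  # candidate new baseline, pending human review

baseline = {"timestamp", "user", "src_ip"}
detect_drift("okta", ['{"timestamp": 1, "actor": "a", "src_ip": "x"}'], baseline)
```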
Why Traditional Data Lake Tools Fail Security Teams
The challenge goes deeper than engineering effort. Most data lake tools were designed for business analytics: structured, columnar data that fits neatly into SQL tables. Security logs are fundamentally different. As Cliff explains: "For a lot of security logs they can be a lot messier. They can be like deeply nested JSON or lots of text like PowerShell command line text and so on. That's where SQL engines that are the typical data lake engines really break down."
This mismatch creates a paradox: you move to a data lake for visibility and cost savings, but then you can't effectively search the very data you're storing. Traditional full-text search capabilities that security teams rely on in Splunk or Elastic simply don't exist in most SQL-based data lake platforms like Athena, Presto, or Trino.
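To see what security-style search actually has to do, here is a naive sketch: flatten nested JSON and substring-match every value. Trivial over a sample, and exactly the workload columnar SQL engines are not built for at petabyte scale. The event payload is illustrative.

```python
# Minimal sketch of full-text-style search over messy logs: flatten
# nested JSON and substring-match every value. Fine for a sample;
# at lake scale this is the workload SQL engines handle poorly.
import json

def flatten(value, prefix=""):
    if isinstance(value, dict):
        for k, v in value.items():
            yield from flatten(v, f"{prefix}{k}.")
    elif isinstance(value, list):
        for i, v in enumerate(value):
            yield from flatten(v, f"{prefix}{i}.")
    else:
        yield prefix.rstrip("."), str(value)

def grep_events(lines, needle):
    for line in lines:
        for path, text in flatten(json.loads(line)):
            if needle.lower() in text.lower():
                yield path, text

events = ['{"process": {"cmdline": "powershell -enc SQBFAFgA"}}']  # illustrative
print(list(grep_events(events, "-enc")))
```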
The Hard Choices: What Gets Sacrificed?
When SIEM costs become prohibitive, security teams face brutal triage decisions. Cliff describes a common pattern: "They'll use Cribl to delete fields from logs, filter them down, sample them down, try to just keep the log volume down to avoid spending too much on ingestion volume."
But here's the critical difference between SRE and security use cases: "For observability use cases, getting sampled data is all right, but for security teams, it can be pretty terrifying to be like, well, I'm only keeping like 20% of my log data. The threat actor activity and maybe the IOCs that I care about, like malicious IP addresses, are invisible to me."
Sampling works for understanding system health. It's catastrophic for security, where a single malicious event can represent a critical breach. The threat actor doesn't show up in every log entry; the evidence may live only in the ones you chose to drop.
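The difference is easy to quantify. A back-of-envelope sketch using the 20% retention figure from the quote above:

```python
# Back-of-envelope sketch: survival odds of attacker evidence under
# 20% sampling. A latency trend survives sampling; a lone IOC often won't.
SAMPLE_RATE = 0.20

p_single_event = SAMPLE_RATE                 # one malicious login: 20%
p_any_of_ten = 1 - (1 - SAMPLE_RATE) ** 10   # ten related events: ~89%,
                                             # but most of the chain is lost
print(f"Single IOC retained: {p_single_event:.0%}")
print(f"At least one of 10 related events retained: {p_any_of_ten:.0%}")
```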
Amazon Security Lake: Promise vs. Reality
Many teams look to AWS Security Lake as a turnkey solution, but Cliff's analysis reveals important limitations. While it's "a really good first step" with built-in support for many log sources and automatic OCSF (Open Cybersecurity Schema Framework) transformation, two major challenges remain:
First, the custom log problem: "For custom log sources or log sources that aren't in their list of supported log sources, you're gonna have to do the work to get it to fit into this very strict schema, and that can be a massive amount of work." Given that most enterprises run hundreds of custom applications, this isn't an edge case; it's the majority use case.
Second, the search performance issue: "If you have things like command line text from EDR logs like PowerShell commands, you're trying to do substring search and really understand messier log data that is unfortunately still quite slow in the data lake." The platform is optimized for columnar, SQL-friendly data, not the messy reality of security logs.
The AI-Assisted Future (With Important Caveats)
Cliff sees AI as a genuine accelerator for data lake workflows, but with critical human-in-the-loop requirements. "AI can really help you figure out how to normalize your data if you really need to fit it into a SQL schema. It will get you like 80% of the way there. There's still like a bit more; it'll hallucinate a little bit."
For detection engineering, AI becomes particularly powerful: "Because it kind of knows everything that's out there, it can be a really great brainstorming partner." But Cliff is clear about boundaries: "My opinion for now is that it's not yet ready to be fully trusted with important investigations and response. It is good at getting started, but I think humans are still very much needed."
The most promising pattern he's seeing? Automated detection tuning: "An alert goes off, an agent takes a first cut at the investigation, then will make a recommendation for what the detection rule should probably be like. Maybe this is too noisy and we should tune it. Then it will go open up a pull request in GitHub, and then the team can just review it, accept it, and then the detection rule is now tuned better."
This closes the loop on detection engineering without requiring manual Jira tickets and waiting for vacation schedules. But notice the pattern: AI proposes, humans approve. The expertise remains human; the acceleration comes from AI.
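Sketched as code, the shape of that loop looks something like the following. Every function here is a hypothetical stub standing in for your agent framework, GitHub client, and notifier; the critical property is that nothing merges without a human.

```python
# Minimal sketch of the "AI proposes, humans approve" tuning loop.
# All helpers are hypothetical stubs; nothing ships without a human
# reviewing and merging the pull request.
def ai_agent_investigate(alert, rule):
    # Stub: in practice an agent triages the alert and drafts a tuned rule.
    return {
        "investigation_summary": f"Rule {rule} is noisy for {alert['rule_id']}",
        "proposed_rule_change": "add exclusion for known-good scanner IPs",
    }

def open_pull_request(repo, branch, diff, body):
    # Stub: in practice, call the GitHub API to open a PR with the draft.
    return f"https://github.com/{repo}/pull/123"  # placeholder URL

def handle_noisy_alert(alert, detection_rule):
    draft = ai_agent_investigate(alert, detection_rule)
    pr_url = open_pull_request(
        repo="security/detections",          # hypothetical repo
        branch=f"tune/{alert['rule_id']}",
        diff=draft["proposed_rule_change"],
        body=draft["investigation_summary"],
    )
    # The loop stops here: a human reviews, edits, and merges the PR.
    print(f"Review requested: {pr_url}")

handle_noisy_alert({"rule_id": "brute-force-login"}, "rule-42")
```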
Strategic Recommendations for 2026 Planning
For CISOs evaluating the SIEM vs. data lake decision in 2026 planning, Cliff offers pragmatic guidance: "I don't think it's quite time to totally replace your SIEM with a data lake. One pattern that tends to work well is to say, 'cool, all my logs used to fit in my SIEM a few years ago. Now it's like 10% of them. Let me keep those 10% going to my SIEM... and then nine terabytes a day of logs instead of dropping them entirely let's make the first step which is just store them in S3 for compliance purposes.'"
This hybrid approach acknowledges reality: traditional SIEMs still provide better usability for hot-path investigations, while data lakes solve the retention and cost problem for the long tail of logs you can't afford to keep in your SIEM.
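The routing logic behind this hybrid pattern is deliberately boring. A minimal sketch with illustrative source names and stubbed senders: the hot-path ~10% keeps flowing to the SIEM, while every event also lands in S3 for cheap retention.

```python
# Minimal sketch of the hybrid pattern: hot-path sources keep flowing to
# the SIEM, while every event also lands in S3 for long-term retention.
# Source names and senders are illustrative stubs.
HOT_PATH_SOURCES = {"okta", "cloudtrail-mgmt", "edr-alerts"}  # the ~10%

def send_to_siem(source, event):
    print(f"SIEM <- {source}")  # stub for your SIEM forwarder

def send_to_s3(source, event):
    print(f"S3   <- {source}")  # stub for an S3/Firehose writer

def route_event(source: str, event: dict):
    if source in HOT_PATH_SOURCES:
        send_to_siem(source, event)  # hot path: interactive investigation
    send_to_s3(source, event)        # everything: compliance retention

route_event("okta", {"event": "login"})
route_event("nginx-access", {"event": "GET /"})
```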
The decision of whether to build or buy depends on your team composition: "If you have a really strong data engineering team, whether in the security side or the rest of your organization already doing a lot of data engineering with data lakes for other purposes like business analytics or observability, then yeah, you can share that work with them. That can be a fun project. It is a forever project though."
But for teams without deep data engineering resources: "If you don't have that data engineering talent and resources, you'll probably need to buy something." Be honest about your team's capabilities before committing to a multi-year engineering project.
The Schema Evolution Problem Nobody Talks About
Perhaps the most underestimated challenge Cliff identifies is schema evolution. "Tools in the future need to embrace the fact that logs are going to be messy. We as humans can kind of see these schema changes, be like, 'eh, I get it. I get what this new field means.'"
This human ability to adapt to changing schemas, to understand that a renamed field still represents the same data, is something current data lake tools completely lack. When an application updates and changes field names, your carefully constructed SQL queries break, your detections fail, and you lose visibility until someone manually fixes the schema mapping.
The future Cliff envisions: "Messiness will be embraced more", with tools that can handle schema drift without requiring constant manual intervention. Until then, budget for weekly schema maintenance as part of your data lake TCO.
OCSF & Schema Standards
Open Cybersecurity Schema Framework (OCSF) - Vendor-agnostic schema for security logs
AWS Security Lake Documentation - Managed data lake for security data
AI-Assisted Security Operations
Model Context Protocol (MCP) - Connect AI assistants to external data sources
Cursor - AI-powered code editor for automation scripts
GitHub Copilot for Security - AI assistance for security workflows
Cloud Security Podcast
For deeper discussion on failed data lakes, AI in detection engineering, and where SIEM still fits.
Question for you (reply to this email):
🤖 Would you be Team SIEM or Team Security Data Lake?
Next week, we'll explore another critical aspect of cloud security. Stay tuned!
📬 Want weekly expert takes on AI & Cloud Security? [Subscribe here]
We would love to hear from you📢 for a feature or topic request or if you would like to sponsor an edition of Cloud Security Newsletter.
Thank you for continuing to subscribe, and welcome to the new members of this newsletter community 💙
Peace!
Was this forwarded to you? You can sign up here to join our growing readership.
Want to sponsor the next newsletter edition? Let's make it happen.
Have you joined our FREE Monthly Cloud Security Bootcamp yet?
Check out our sister podcast, AI Security Podcast.


