/
...
/
/
8. Compass: Composable & Scalable Signal Engineering, Justin Young, Abnormal Security
Search
Duplicate
Try Notion

8. Compass: Composable & Scalable Signal Engineering, Justin Young, Abnormal Security

Use case: stop all email attacks
Need high recall
High precision
Rapid iteration cycle, because attackers update their techniques
Mantra: move fast and don’t break things
How to build an ML platform?
Prevent entanglement
Prevent online / offline skew
Entanglement
“Changing anything changes everything”
Hard to hard signal and basically impossible to remove ones, with an effect anywhere later on.
Function composition: f(x) = h(x, g(x))
Function composition are the primary responsibility of ML pipeline
Unconstrained function composition is the root of entanglement
Total dependencies grow square with the total of signals
So we need constrained function composition instead
How does the signal dag work in production?
Python
Copy
@compass.single_output_transformation("variable_len") def extract_len(input_variable: str) -> int: return len(input_variable)
Key features of dag:
Data structure append-only
Easy to understand the pipeline: list dependencies and viz
Validate properties of the pipeline: no cycle, unused branch, every signal produced exactly by one transformation
Platformize pipeline: propagate errors downstream, parallel execution of parallel Dags
How to execute this offline/online at scale?
Most of the time offline and online transformation will use the same transformations
Key takeaway: make sure your ML pipeline implements both a feature store and a function store!
The system should be function centric