Offloading Workflow Engine Traffic to S3 Using Pre-Signed Links
Over the past year, Activepieces has been growing rapidly, powering millions of workflow runs. Our workflow engine has been battle-tested, handling various creative and unexpected use cases. While we’re thrilled with this growth, it’s also pushed our infrastructure to its limits.
One major challenge we faced recently involved scaling the upload process for logs and step files associated with flow runs. Let me walk you through the problem, the solution, and how signed links helped us overcome it.
The Problem: Scaling File Uploads (Logs and Step Files)
In Activepieces, every flow run consists of a trigger followed by one or more steps, each with its own inputs and outputs. In addition to logs (used for debugging, reviewing API responses, and analyzing data), we handle step files, such as files generated during execution.
How Files Were Handled Before
- When a worker processed a flow run, it generated logs and step files for each step.
- These files were sent to the app server, which stored them in an S3 bucket.
- Files used in the workflow (like step files) were also sent to the app server so the run could be replayed later.
As flow runs increased, so did the volume of logs and step files, which caused the app servers to become a bottleneck.
In other words, every file took the path Worker → App server → S3.
Challenges We Encountered
- High Traffic on App Servers: Logs and step files flooded app servers, slowing down performance.
- Scaling Issues: Adding app servers helped temporarily but was inefficient during spikes.
- Limited Capacity: App servers couldn’t handle surges, leading to delays and reduced reliability.
The Solution: Signed Links to the Rescue
We redesigned the file upload process for both logs and step files using signed links.
What Are Signed Links?
Signed links (also called pre-signed URLs) are time-limited URLs that carry a signature granting temporary access to a specific object in S3. They let workers upload files directly to S3 without passing the file contents through an intermediary server.
Here’s how signed links work:
- A worker requests a signed link from the backend.
- The backend generates a signed link for secure file upload using S3 credentials.
- The worker uploads the file directly to S3.
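To make the idea concrete, here is a minimal sketch of how a time-limited signed link can work in principle. It is not S3's actual signing algorithm (S3 uses AWS Signature Version 4); the `signLink` and `verifyLink` helpers, the secret, and the URL layout are hypothetical and only illustrate the expiry-plus-signature idea.

```javascript
const { createHmac, timingSafeEqual } = require("node:crypto");

// Hypothetical secret known to the link issuer and the storage service.
const SECRET = "demo-secret";

// HMAC over method, path, and expiry, so none of them can be altered.
function sign(method, path, expires) {
  return createHmac("sha256", SECRET)
    .update(`${method}\n${path}\n${expires}`)
    .digest("hex");
}

// Issue a link that stays valid for `ttlSeconds` after `nowSeconds`.
function signLink(method, path, ttlSeconds, nowSeconds) {
  const expires = nowSeconds + ttlSeconds;
  return `${path}?Expires=${expires}&Signature=${sign(method, path, expires)}`;
}

// The storage side recomputes the signature and checks the expiry.
function verifyLink(method, link, nowSeconds) {
  const [path, query] = link.split("?");
  const params = new URLSearchParams(query);
  const expires = Number(params.get("Expires"));
  const given = Buffer.from(params.get("Signature") || "", "hex");
  const expected = Buffer.from(sign(method, path, expires), "hex");
  const signatureOk =
    given.length === expected.length && timingSafeEqual(given, expected);
  return signatureOk && nowSeconds <= expires;
}
```

The point of the sketch is that the holder of the link never sees the secret: it can use the link for the signed method and path until the expiry, and nothing else.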
The New Workflow
Here’s how we changed the process:
- Worker Initiates Upload: After processing a flow run, the worker calculates the size of both the log and step files and sends these details to the backend.
- Backend Provides Signed Links: The backend responds with pre-signed URLs for both the log and step files.
- Direct Upload: The worker uses the signed links to upload both files directly to S3.
With this new setup, the app server only issues signed links; the file contents never pass through it.
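Seen from the worker, the three steps above can be sketched as follows. This is an illustration only: `uploadRunFiles`, `requestSignedLinks`, and `putFile` are hypothetical names, and the response shape (one `url` per file, in request order) is an assumption rather than Activepieces' actual API.

```javascript
// Hypothetical worker-side flow: measure files, ask the backend for signed
// links, then upload each file straight to S3.
// `requestSignedLinks` and `putFile` are injected so transport details
// (HTTP client, auth, retries) stay out of this sketch.
async function uploadRunFiles(files, requestSignedLinks, putFile) {
  // Step 1: send only metadata (name and size) to the backend.
  const metadata = files.map((file) => ({
    fileName: file.fileName,
    fileSize: Buffer.byteLength(file.content),
  }));

  // Step 2: the backend answers with one signed link per file
  // (assumed shape: [{ fileName, url }], same order as the request).
  const links = await requestSignedLinks({ files: metadata });
  if (links.length !== files.length) {
    throw new Error("backend returned an unexpected number of signed links");
  }

  // Step 3: upload the file bodies directly to S3, in parallel.
  await Promise.all(
    files.map((file, index) => putFile(links[index].url, file.content))
  );
  return links.map((link) => link.fileName);
}
```

Only the small metadata request touches the backend; the file bodies go to S3 in step 3.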
Signed Links in Action
Here’s how signed links work for both logs and step files:
Step 1: Requesting Signed Links
The worker sends a request with file details (name and size):
{
  "files": [
    {"fileName": "log_12345.json", "fileSize": 1048576},
    {"fileName": "step_file_1.json", "fileSize": 2048000}
  ]
}
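Because the backend signs whatever it is asked to, it can be worth validating this payload before issuing links. The sketch below is one possible check; `validateUploadRequest`, the 10 MiB cap, and the file-name pattern are assumptions for illustration, not limits taken from Activepieces.

```javascript
// Hypothetical limits; tune these to your own workload.
const MAX_FILE_BYTES = 10 * 1024 * 1024;
const SAFE_NAME = /^[A-Za-z0-9._-]+$/;

// Returns a list of human-readable problems; an empty list means valid.
function validateUploadRequest(body) {
  if (!body || !Array.isArray(body.files) || body.files.length === 0) {
    return ["files must be a non-empty array"];
  }
  const errors = [];
  body.files.forEach((file, index) => {
    const { fileName, fileSize } = file || {};
    if (typeof fileName !== "string" || !SAFE_NAME.test(fileName)) {
      errors.push(`files[${index}].fileName is missing or contains unsafe characters`);
    }
    if (!Number.isInteger(fileSize) || fileSize <= 0) {
      errors.push(`files[${index}].fileSize must be a positive integer`);
    } else if (fileSize > MAX_FILE_BYTES) {
      errors.push(`files[${index}].fileSize exceeds ${MAX_FILE_BYTES} bytes`);
    }
  });
  return errors;
}
```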
Step 2: Backend Generates Signed Links
The backend generates signed URLs for each file:
const AWS = require('aws-sdk'); // AWS SDK for JavaScript v2
const s3 = new AWS.S3();

// Parameters for the log file; ContentLength is the size sent in Step 1.
const params = {
  Bucket: "my-bucket",
  Key: "log_12345.json",
  ContentLength: 1048576,
  Expires: 3600, // Link valid for 1 hour
  ContentType: "application/json"
};
const signedUrlForLog = s3.getSignedUrl("putObject", params);

// Same parameters for the step file, with its own key and size.
const signedUrlForStepFile = s3.getSignedUrl("putObject", {
  Bucket: "my-bucket",
  Key: "step_file_1.json",
  ContentLength: 2048000,
  Expires: 3600,
  ContentType: "application/json"
});
console.log(signedUrlForLog, signedUrlForStepFile);
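The two calls above differ only in key and size, so the parameters can be built from the Step 1 request in a loop. A minimal sketch, assuming a hypothetical helper `buildPutParams` and the same placeholder bucket name; each result is meant to be passed to `s3.getSignedUrl("putObject", params)` as in the snippet above.

```javascript
// Hypothetical helper: turn one entry of the Step 1 request into the
// parameter object used for signing a putObject URL.
function buildPutParams(bucket, file, expiresSeconds = 3600) {
  return {
    Bucket: bucket,
    Key: file.fileName,
    ContentLength: file.fileSize, // a number of bytes, not a string
    Expires: expiresSeconds,
    ContentType: "application/json",
  };
}

// The request body from Step 1.
const request = {
  files: [
    { fileName: "log_12345.json", fileSize: 1048576 },
    { fileName: "step_file_1.json", fileSize: 2048000 },
  ],
};

const allParams = request.files.map((file) => buildPutParams("my-bucket", file));
console.log(allParams.map((params) => `${params.Key}: ${params.ContentLength} bytes`));
```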
The signed links might look like:
https://my-bucket.s3.amazonaws.com/log_12345.json?AWSAccessKeyId=ACCESS_KEY&Expires=TIMESTAMP&Signature=SIGNATURE
https://my-bucket.s3.amazonaws.com/step_file_1.json?AWSAccessKeyId=ACCESS_KEY&Expires=TIMESTAMP&Signature=SIGNATURE
Step 3: Uploading the Files
The worker uploads both the log and step files directly to S3 using the signed links. Because the links were signed with a content type, each request sends the same Content-Type header:
curl -X PUT -T log_12345.json \
  -H "Content-Type: application/json" \
  "https://my-bucket.s3.amazonaws.com/log_12345.json?AWSAccessKeyId=ACCESS_KEY&Expires=TIMESTAMP&Signature=SIGNATURE"
curl -X PUT -T step_file_1.json \
  -H "Content-Type: application/json" \
  "https://my-bucket.s3.amazonaws.com/step_file_1.json?AWSAccessKeyId=ACCESS_KEY&Expires=TIMESTAMP&Signature=SIGNATURE"
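In a Node.js worker the same upload can be done without shelling out to curl. The sketch below assumes Node 18 or later (for the built-in `fetch`); `uploadToSignedUrl` is a hypothetical helper, and the `fetchImpl` parameter exists only so the function can be exercised without network access.

```javascript
// Hypothetical helper: PUT a file body to a pre-signed URL.
async function uploadToSignedUrl(url, body, contentType, fetchImpl = fetch) {
  const response = await fetchImpl(url, {
    method: "PUT",
    // If a content type was fixed when the link was signed, the upload
    // has to send the same value or S3 rejects the signature.
    headers: { "Content-Type": contentType },
    body,
  });
  if (!response.ok) {
    throw new Error(`upload failed with HTTP status ${response.status}`);
  }
  return response.status;
}
```

A worker would call it once per link, for example `await uploadToSignedUrl(signedUrl, fileContents, "application/json")`, and treat a thrown error as a signal to request a fresh link and retry.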
Results: Before vs. After
| Aspect | Before | After |
|---|---|---|
| File Upload | Worker → App → S3 | Worker → S3 |
| Traffic Bottleneck | App server overloaded | App servers only issue signed links |
| Scalability | Limited by app server capacity | Only limited by S3 capacity |
| Cost Efficiency | More app servers required | Fewer app servers needed |
Conclusion
Switching to signed links for logs and step files reduced the load on app servers and improved scalability. This approach allowed us to handle more files efficiently without overwhelming infrastructure. If you're facing similar scaling challenges, signed links are a simple, effective solution.