Optimizing Large-Scale Data Exports: FastAPI, GraphQL, and AWS S3

Vinay Chaudhari

Apr 08, 2025 · 4 min read

Introduction

Struggling with slow, inefficient file downloads in your application? Generating and serving large files can freeze the UI, overload the server, and leave users waiting. In this blog, we'll walk you through an event-driven approach that uses FastAPI, GraphQL, AWS SQS, AWS Lambda, S3 Signed URLs, and GraphQL Subscriptions to export massive datasets (5M+ rows) without blocking the application.

Challenges with Large File Downloads

1. UI Freezing Due to Large File Generation

  • When a file is generated synchronously on the backend, the request remains open, blocking other operations.

2. Server Overload & Scalability Issues

  • Large files stored on the backend can consume significant resources.
  • Traditional polling methods increase unnecessary API calls and degrade performance.

3. Inefficient Polling for File Status

  • Repeated polling increases server load and slows down response times.
  • No real-time updates for file readiness.

4. Scaling Background Processing Efficiently

  • Using traditional Redis Queue (RQ Worker) or Celery may require manual infrastructure scaling.
  • A serverless, event-driven approach provides better scalability and cost efficiency.

5. Efficiently Handling Large Datasets (5+ Million Rows)

  • Querying a large dataset at once can overwhelm memory and slow down performance.
  • Efficient batch processing and streaming are required to handle such datasets.
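
As a rough illustration of the batch-processing point above, here is a minimal sketch of streaming a large table page by page instead of loading it in one query. The names `iter_batches` and `fake_fetch_page` are hypothetical, and the in-memory list stands in for a real database or GraphQL call.

```python
from typing import Any, Callable, Dict, Iterator, List

Row = Dict[str, Any]


def iter_batches(
    fetch_page: Callable[[int, int], List[Row]], batch_size: int
) -> Iterator[List[Row]]:
    """Yield pages of rows until the data source returns an empty page."""
    offset = 0
    while True:
        page = fetch_page(offset, batch_size)
        if not page:
            return
        yield page
        # Advance by the rows actually received, not the rows requested.
        offset += len(page)


# Stand-in data source (assumption): a real one would query the database or API.
_FAKE_TABLE = [{"id": i, "column1": f"a{i}", "column2": f"b{i}"} for i in range(10)]


def fake_fetch_page(offset: int, limit: int) -> List[Row]:
    return _FAKE_TABLE[offset:offset + limit]


# Only one page is held in memory at a time while iterating.
batch_sizes = [len(batch) for batch in iter_batches(fake_fetch_page, batch_size=4)]
```

Because the generator yields one page at a time, memory use is bounded by the batch size rather than by the size of the table.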

Industry-Standard Optimized Approach

We solve these challenges using:

  • FastAPI for high-performance API processing.
  • AWS SQS and Lambda for serverless background task processing.
  • AWS S3 Signed URLs for scalable file storage and secure downloads.
  • GraphQL Subscriptions (Hasura/Apollo) for real-time status updates.
  • Chunked File Transfers for optimized large file downloads.
  • GraphQL for efficient data fetching and batch processing.
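
The signed-URL item above could look roughly like the sketch below. It assumes a boto3 S3 client is passed in and uses its `generate_presigned_url` method; the helper name `build_download_url`, the object key, and the 15-minute expiry are illustrative assumptions. `FakeS3Client` is a stub so the example runs without AWS credentials, and the URL it returns is not a real signature.

```python
from typing import Any


def build_download_url(
    s3_client: Any, bucket: str, key: str, expires_in: int = 900
) -> str:
    """Ask the S3 client for a time-limited GET URL for one object."""
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # seconds until the link stops working
    )


class FakeS3Client:
    """Stub with the same method shape as the boto3 client, for a dry run only."""

    def generate_presigned_url(self, ClientMethod, Params=None, ExpiresIn=3600):
        return (
            f"https://{Params['Bucket']}.s3.amazonaws.com/"
            f"{Params['Key']}?X-Amz-Expires={ExpiresIn}"
        )


# In production, pass boto3.client("s3") instead of the stub.
url = build_download_url(FakeS3Client(), "your-bucket-name", "exports/task-123.csv")
```

A short expiry limits how long a leaked link stays usable; the right value depends on how long users typically take to start the download.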

Solution Overview

Tech Stack Used:

  • Frontend: Vue.js (with Quasar Framework) + Apollo GraphQL Client
  • Backend: FastAPI (Python) + GraphQL + GraphQL Subscriptions (Hasura/Apollo)
  • Task Queue: AWS SQS + AWS Lambda
  • Storage: AWS S3 with Signed URLs
  • Event-Driven Processing: AWS Lambda + EventBridge

Workflow Breakdown:

  1. User requests file generation → Vue.js sends a request to the backend.
  2. Backend fetches data from GraphQL in batches and publishes tasks to AWS SQS → Tasks get queued in SQS.
  3. AWS Lambda listens for SQS messages → Processes batched database results, generates the file incrementally, and uploads it to AWS S3.
  4. File status updates are sent via GraphQL Subscriptions → Real-time updates without polling.
  5. Vue.js listens to GraphQL Subscription → UI updates instantly when the file is ready.
  6. Download via AWS S3 Signed URL → User downloads the file securely from AWS.
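
Step 3 can be sketched as follows. This is a minimal, hypothetical handler rather than the actual Lambda behind this workflow: it assumes each SQS record body is the JSON message published by the backend (a `task_id` plus a `data` list) and the three columns from the sample query. It only converts each batch into a CSV fragment; uploading fragments to S3 and assembling the final file are left out.

```python
import csv
import io
import json
from typing import Any, Dict, List

# Columns taken from the sample GraphQL query; adjust to your schema.
FIELDNAMES = ["id", "column1", "column2"]


def rows_to_csv(rows: List[Dict[str, Any]], include_header: bool = False) -> str:
    """Serialize a batch of rows to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    if include_header:
        writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def handler(event: Dict[str, Any], context: Any = None) -> Dict[str, str]:
    """Convert every SQS record in the event into CSV text, grouped by task_id."""
    fragments: Dict[str, str] = {}
    # Lambda delivers SQS messages under event["Records"], one body per message.
    for record in event.get("Records", []):
        body = json.loads(record["body"])
        task_id = body["task_id"]
        fragments[task_id] = fragments.get(task_id, "") + rows_to_csv(body["data"])
    return fragments


# Local dry run with a hand-built event in the SQS shape.
sample_event = {
    "Records": [
        {"body": json.dumps({"task_id": "t1", "data": [
            {"id": 1, "column1": "a", "column2": "b"},
            {"id": 2, "column1": "c", "column2": "d"},
        ]})}
    ]
}
result = handler(sample_event)
```

Keeping the CSV conversion in its own function makes it easy to test locally before wiring the handler to a real queue.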

Backend Implementation (FastAPI + GraphQL + AWS SQS)

1. Install Dependencies


pip install fastapi uvicorn boto3 requests gql pandas xlsxwriter

2. FastAPI Backend (app.py)


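from fastapi import FastAPI, HTTPException
import boto3
import json
import uuid
import requests

app = FastAPI()

# AWS configuration (placeholders; replace with your own values)
SQS_QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/your-account-id/your-queue"
S3_BUCKET_NAME = "your-bucket-name"
AWS_REGION = "us-east-1"

sqs = boto3.client("sqs", region_name=AWS_REGION)
s3_client = boto3.client("s3", region_name=AWS_REGION)

GRAPHQL_ENDPOINT = "https://your-graphql-api.com/graphql"

# SQS caps the size of a message body, so a full 100,000-row batch cannot go
# into a single message. 1000 rows per message is an assumed starting point;
# tune it to the width of your rows.
ROWS_PER_MESSAGE = 1000

@app.post("/generate-file")
def generate_file():
    """Handles file generation by querying GraphQL in batches and sending tasks to AWS SQS."""
    # A UUID avoids collisions when two requests arrive within the same second.
    task_id = str(uuid.uuid4())
    query = """
    query GetLargeTableData($offset: Int!, $limit: Int!) {
      largeTable(offset: $offset, limit: $limit) {
        id
        column1
        column2
      }
    }
    """
    batch_size = 100000
    offset = 0

    while True:
        response = requests.post(
            GRAPHQL_ENDPOINT,
            json={"query": query, "variables": {"offset": offset, "limit": batch_size}},
            timeout=60,  # fail instead of hanging forever on a stalled request
        )
        response.raise_for_status()
        payload = response.json()
        if payload.get("errors"):
            raise HTTPException(status_code=502, detail="GraphQL query failed")
        # "data" can be null in a GraphQL response, so guard before reading it.
        data = (payload.get("data") or {}).get("largeTable") or []

        if not data:
            break

        # Split the fetched batch into several smaller SQS messages.
        for start in range(0, len(data), ROWS_PER_MESSAGE):
            message = json.dumps({"task_id": task_id, "data": data[start:start + ROWS_PER_MESSAGE]})
            sqs.send_message(QueueUrl=SQS_QUEUE_URL, MessageBody=message)
        offset += batch_size

    print(f"Task {task_id} successfully queued!")
    return {"task_id": task_id}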
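
SQS caps the size of each message body (256 KB was the long-standing limit; check the current quota for your account), so how many rows fit in one message depends on how wide the rows are. The sketch below shows one possible way to split rows by serialized size instead of by a fixed row count. The function name `chunk_rows_by_size` and the `max_bytes` budget are assumptions for illustration, and a single row larger than the budget is still emitted on its own.

```python
import json
from typing import Any, Dict, Iterator, List

Row = Dict[str, Any]


def _encode(obj: Any) -> bytes:
    """Compact JSON encoding, so sizes can be added up predictably."""
    return json.dumps(obj, separators=(",", ":")).encode("utf-8")


def chunk_rows_by_size(task_id: str, rows: List[Row], max_bytes: int) -> Iterator[str]:
    """Yield JSON message bodies whose encoded size stays within max_bytes."""
    envelope_size = len(_encode({"task_id": task_id, "data": []}))
    current: List[Row] = []
    size = envelope_size
    for row in rows:
        row_size = len(_encode(row)) + 1  # +1 for the comma between rows
        if current and size + row_size > max_bytes:
            yield _encode({"task_id": task_id, "data": current}).decode("utf-8")
            current, size = [], envelope_size
        current.append(row)
        size += row_size
    if current:
        yield _encode({"task_id": task_id, "data": current}).decode("utf-8")


# Tiny demo budget so the split is visible; a real budget would sit just
# below the SQS message size limit.
rows = [{"id": i} for i in range(5)]
messages = list(chunk_rows_by_size("t1", rows, max_bytes=64))
```

Each yielded string can be passed as `MessageBody` to `sqs.send_message`. The size estimate counts one separator per row, so it slightly overestimates and errs on the safe side.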

Frontend Implementation (Vue.js + Apollo GraphQL Client)

1. Install Dependencies


npm install @vue/apollo-composable @apollo/client graphql graphql-tag

2. Vue Component for File Generation & Download


<template>
  <div>
    <q-btn label="Generate File" @click="generateFile" />
    <q-spinner v-if="loading" color="primary" />
    <a v-if="downloadUrl" :href="downloadUrl" download>
      <q-btn label="Download File" color="green" />
    </a>
  </div>
</template>

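<script>
import { ref, watch } from 'vue';
import { useSubscription } from '@vue/apollo-composable';
import gql from 'graphql-tag';

const FILE_STATUS_SUBSCRIPTION = gql`
  subscription OnFileGenerated($taskId: String!) {
    fileStatusUpdated(task_id: $taskId) {
      download_url
    }
  }
`;

export default {
  setup() {
    const taskId = ref(null);
    const loading = ref(false);
    const downloadUrl = ref(null);

    // Composables such as useSubscription must be called inside setup().
    // The subscription stays disabled until a task ID is available.
    const { result, error } = useSubscription(
      FILE_STATUS_SUBSCRIPTION,
      () => ({ taskId: taskId.value }),
      () => ({ enabled: !!taskId.value })
    );

    // Show the download link as soon as the backend reports the file is ready.
    watch(result, (data) => {
      if (data?.fileStatusUpdated?.download_url) {
        downloadUrl.value = data.fileStatusUpdated.download_url;
        loading.value = false;
      }
    });

    watch(error, (err) => {
      if (err) {
        console.error("GraphQL Subscription Error:", err);
        loading.value = false;
      }
    });

    async function generateFile() {
      loading.value = true;
      downloadUrl.value = null;
      try {
        const response = await fetch("/generate-file", { method: "POST" });
        if (!response.ok) {
          throw new Error(`Request failed with status ${response.status}`);
        }
        const data = await response.json();
        taskId.value = data.task_id; // enables the subscription above
      } catch (err) {
        console.error("File generation request failed:", err);
        loading.value = false;
      }
    }

    return { loading, downloadUrl, generateFile };
  }
};
</script>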

Final Thoughts

Combining FastAPI, GraphQL, AWS SQS, Lambda, GraphQL Subscriptions, and S3 Signed URLs gives a serverless, real-time, scalable solution for large file downloads. Batched querying of 5M+ rows, event-driven processing, and secure cloud storage remove the main performance bottlenecks and improve the user experience.

Ready to transform your file processing experience? If you're working with millions of rows and need a fast, scalable, and serverless solution, this is your answer! Try it out and let us know how it works for you!
