AWS Inventory Extraction using CloudQuery

Extract your multi-account AWS inventory using CloudQuery



I wanted to extract the AWS inventory spread across multiple regions and accounts. Of course there are many solutions and different ways to do this, but in this post I’d like to share how I was able to do it using an open-source tool called CloudQuery.

To put it in extremely simple words, CloudQuery converts your cloud (AWS, GCP and many more) asset information into a SQL database. For example, your EC2 instances end up as rows in a table called aws_ec2_instances, with one column per instance attribute.
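Since everything lands in ordinary PostgreSQL tables, you can inspect the schema itself with plain SQL. For instance, a query like the one below (assuming the tables live in the default public schema) lists the columns of the aws_ec2_instances table:

```sql
-- List the columns CloudQuery created for EC2 instances
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = 'aws_ec2_instances'
ORDER BY ordinal_position;
```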


With the power of SQL, the possibilities for utilising this data are endless. As an example, if you’d like to identify which EC2 instances are still using IMDSv1, simply fire the query:

select * from aws_ec2_instances where metadata_options_http_tokens = 'optional' ;


  • A PostgreSQL database. The command below can be used to quickly spin one up using Docker.

docker run --name cloudquery_postgres -p 5432:5432 -v ${PWD}:/tmp -e POSTGRES_PASSWORD=pass -d postgres

  • A PostgreSQL administration tool that can connect to the database we’ll spin up
  • AWS access keys configured in ~/.aws/credentials
  • CloudQuery
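For reference, a ~/.aws/credentials file with one named profile per account might look like the fragment below. The profile names and placeholder keys are illustrative; the names just need to match what you later reference in config.hcl:

```ini
[example-dev]
aws_access_key_id     = <dev-account-access-key-id>
aws_secret_access_key = <dev-account-secret-access-key>

[example-prod]
aws_access_key_id     = <prod-account-access-key-id>
aws_secret_access_key = <prod-account-secret-access-key>

[example-staging]
aws_access_key_id     = <staging-account-access-key-id>
aws_secret_access_key = <staging-account-secret-access-key>
```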

NOTE: We are setting up a test bed here and not a production-grade solution.


Configuring CloudQuery is quite straightforward; the official CloudQuery documentation for the AWS provider walks through it step by step.

Once CloudQuery has been initialised, we need to edit the config.hcl file, which CloudQuery reads in order to fetch the details. A sample config.hcl file is shown below.

The accounts configured must match the AWS profiles defined in the ~/.aws/credentials file.

// Configuration AutoGenerated by CloudQuery CLI
cloudquery {
  plugin_directory = "./cq/providers"
  policy_directory = "./cq/policies"

  provider "aws" {
    version = "latest"
  }

  // Ensure the credentials match those of the docker command provided above.
  connection {
    dsn = "postgres://postgres:pass@localhost:5432/postgres?sslmode=disable"
  }
}

// All Provider Configurations
provider "aws" {
  configuration {
    // Configuring the different accounts.
    accounts "dev" {
      local_profile = "example-dev"
    }
    accounts "prod" {
      local_profile = "example-prod"
    }
    accounts "staging" {
      local_profile = "example-staging"
    }
    // Optional. By default assumes all regions.
    // regions = ["us-east-1", "us-west-2"]
    // Optional. Enable AWS SDK debug logging.
    aws_debug = false
    // The maximum number of times that a request will be retried for failures. Defaults to 5 retry attempts.
    // max_retries = 5
    // The maximum back off delay between attempts. The backoff delays exponentially with a jitter based on the number of attempts. Defaults to 60 seconds.
    // max_backoff = 30
  }

  // list of resources to fetch
  resources = [
    // resource names elided here, e.g. "ec2.instances"
  ]
}

// Module Configurations
modules {
  // drift configuration block
  drift "drift-example" {
    // state block defines from where to access the state
    terraform {
      // backend: "local" or "s3"
      backend = "local"

      // local backend options
      // files: list of tfstate files
      files = ["/path/to.tfstate"]

      // s3 backend options
      // bucket   = "<tfstate bucket>"
      // keys     = [ "<tfstate key>" ]
      // region   = "us-east-1"
      // role_arn = ""
    }

    // provider "aws" {
    //   account_ids      = ["123456789"]
    //   check_resources  = ["ec2.instances:*"]
    //   ignore_resources = ["ec2.instances:i-123456789", "aws_cloudwatchlogs_filters:*"]
    // }
  }
}

Once the above is in place, fire cloudquery fetch and it’ll start populating the SQL database by pulling the AWS inventory from all the accounts that have been configured in the config.hcl file.

cloudquery fetch


Once the fetch command completes, open up your favourite PostgreSQL administration console and fire the SQL query below.

DO $$
DECLARE
	tbl_name varchar;
	data_exists boolean;
	r record;
BEGIN
	FOR r IN SELECT table_name FROM information_schema.tables WHERE table_schema='public'
	LOOP
		tbl_name := r.table_name;
		RAISE NOTICE '%',tbl_name;
		EXECUTE format('select exists (select 1 from %I)',tbl_name) INTO data_exists;
		RAISE NOTICE '%',data_exists;
		IF data_exists THEN
			EXECUTE format('COPY %I TO %L WITH (FORMAT CSV, HEADER)',tbl_name,CONCAT('/tmp/',tbl_name,'.csv'));
		END IF;
	END LOOP;
END $$;

The query above iterates through all the tables created by CloudQuery and exports only the non-empty ones in CSV format. Each table aggregates resources from all configured accounts.

For example: if you have 5 S3 buckets spread across 3 different accounts, CloudQuery will aggregate all 5 buckets as 5 rows, with a column designating the associated account ID.
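You can see this aggregation directly in SQL. Assuming the bucket table is named aws_s3_buckets and carries an account_id column like the other aws_* tables, a per-account count would look like:

```sql
-- Count S3 buckets per AWS account across all configured profiles
SELECT account_id, count(*) AS bucket_count
FROM aws_s3_buckets
GROUP BY account_id
ORDER BY bucket_count DESC;
```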

The end result will be something like the below in the directory from where you executed the docker command to spin up the PostgreSQL database.

AWS Inventory in CSV

That’s all folks, hope you find this post useful :-)

Rohit Salecha
Security Engineer

A security engineer who is passionate about making the world more secure, one bit at a time. Also head over heels for Infrastructure as Code (IaC) and a believer in the “everything as code” mantra. He specializes in taking the “Shift Left” approach, implementing the Secure Software Development Lifecycle (SSDLC) to develop solutions that are “secure by default”.
