airbytehq/quickstarts

Pokémon Data Stack

Pokémon Analysis and Insights with Airbyte

Extract Pokémon, ability, or move data from the PokeAPI using Airbyte. Load the data into a warehouse for in-depth analysis on Pokémon attributes, popularity, or battle strategies.

You can use a tool like dbt for data transformation, and an orchestrator like Airflow or Dagster if needed.
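
As a rough illustration of the kind of reshaping a transformation step might do, here is a minimal Python sketch that flattens one record shaped like a PokeAPI `/pokemon/{name}` response into a single analysis-friendly row. The sample payload is hand-trimmed, and the function name `flatten_pokemon` and the `stat_` column prefix are assumptions made for this example; in the quickstart itself this logic would more likely live in dbt models.

```python
from typing import Any

# Hand-trimmed sample in the shape of a PokeAPI /pokemon/{name} response.
# Only the fields used below are kept; a real response has many more.
SAMPLE_POKEMON: dict[str, Any] = {
    "id": 25,
    "name": "pikachu",
    "height": 4,
    "weight": 60,
    "types": [{"slot": 1, "type": {"name": "electric"}}],
    "abilities": [
        {"ability": {"name": "static"}, "is_hidden": False, "slot": 1},
        {"ability": {"name": "lightning-rod"}, "is_hidden": True, "slot": 3},
    ],
    "stats": [
        {"base_stat": 35, "effort": 0, "stat": {"name": "hp"}},
        {"base_stat": 90, "effort": 2, "stat": {"name": "speed"}},
    ],
}


def flatten_pokemon(record: dict[str, Any]) -> dict[str, Any]:
    """Flatten nested type, ability, and stat lists into one flat row.

    The function name and output column names are hypothetical.
    """
    row: dict[str, Any] = {
        "id": record["id"],
        "name": record["name"],
        "height": record.get("height"),
        "weight": record.get("weight"),
        # Keep types in slot order so the primary type comes first.
        "types": [
            t["type"]["name"]
            for t in sorted(record.get("types", []), key=lambda t: t["slot"])
        ],
        "abilities": [a["ability"]["name"] for a in record.get("abilities", [])],
    }
    # One column per base stat, e.g. stat_hp, stat_speed.
    for s in record.get("stats", []):
        column = "stat_" + s["stat"]["name"].replace("-", "_")
        row[column] = s["base_stat"]
    return row


if __name__ == "__main__":
    print(flatten_pokemon(SAMPLE_POKEMON))
```

A flat row like this is easier to aggregate when comparing attributes or base stats across Pokémon.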

How to get started:

Hi! I would like to work on these fine Pokémon.

Hi @CodeResolver! Sure, I have assigned it to you :). Let me know if you have any questions.

Hi @CodeResolver! Are you still working on this? Otherwise I may need to unassign. Let me know :)

Hi @ThaliaBarrera, yes I am. I was having issues with Terraform and the ecommerce quickstart and kept getting a 405 error, so I just moved to macOS to try again. Quick question: is Terraform required, and should I maybe use the MongoDB to MySQL quickstart as an example for this integration instead?

Hi again. While researching I also found this video, in which he actually goes through the PokeAPI. Should I implement what is shown there and expand on it? Thanks, and sorry for all the questions :)
https://www.youtube.com/watch?v=kJ3hLoNfz_E

@CodeResolver yes, you do need Terraform. As for your second question, from what I've seen the video is about the Python CDK, but to build a quickstart you'll need to follow the Ecommerce analytics one; it's well written and explained by @ThaliaBarrera, who can give you more info. Ty ^_^

Thanks for replying @bishalbera!

@CodeResolver maybe you can share your Terraform code and the error you're getting so we can further help

Hi @ThaliaBarrera and @bishalbera, thanks for your help. I'm replicating it now on Mac and I'll send you the error if it shows up again.

Hi again, I was finally able to set up Terraform correctly, thanks for your help. While working with dbt I'm running into the errors below. Do you happen to know why I'm getting them?

18:06:31 Running with dbt=1.6.6
18:06:32 Registered adapter: bigquery=1.6.8
18:06:32 Found 6 models, 3 sources, 0 exposures, 0 metrics, 394 macros, 0 groups, 0 semantic models
18:06:32
18:06:33 Concurrency: 1 threads (target='dev')
18:06:33
18:06:33 1 of 6 START sql view model transformed_data.stg_products ...................... [RUN]
18:06:34 1 of 6 ERROR creating sql view model transformed_data.stg_products ............. [ERROR in 0.68s]
18:06:34 2 of 6 START sql view model transformed_data.stg_purchases ..................... [RUN]
18:06:35 2 of 6 ERROR creating sql view model transformed_data.stg_purchases ............ [ERROR in 0.66s]
18:06:35 3 of 6 START sql view model transformed_data.stg_users ......................... [RUN]
18:06:35 3 of 6 ERROR creating sql view model transformed_data.stg_users ................ [ERROR in 0.65s]
18:06:35 4 of 6 SKIP relation transformed_data.product_popularity ....................... [SKIP]
18:06:35 5 of 6 SKIP relation transformed_data.purchase_patterns ........................ [SKIP]
18:06:35 6 of 6 SKIP relation transformed_data.user_demographics ........................ [SKIP]
18:06:35
18:06:35 Finished running 6 view models in 0 hours 0 minutes and 3.06 seconds (3.06s).
18:06:35
18:06:35 Completed with 3 errors and 0 warnings:
18:06:35
18:06:35 Runtime Error in model stg_products (models/staging/stg_products.sql)
404 Not found: Table bigquery-403520:raw_data.products was not found in location US

Thanks!

@CodeResolver does the raw_data.products table actually exist? If so, is the location of the raw_data dataset "US"? Airbyte should have created raw_data.products in BigQuery. You may need to run an Airbyte sync before attempting to run the dbt models.

Note to self: Add that as a note in all quickstarts.
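
A check along these lines can be scripted before running dbt. The sketch below is only an illustration: it parses the "not found" message from the log above into its project, dataset, table, and location parts, and then, assuming the `google-cloud-bigquery` package is installed and credentials are configured, asks BigQuery whether the table exists and where the dataset lives. The names `MissingTable`, `parse_not_found`, `table_exists`, and `dataset_location` are hypothetical helpers written for this example.

```python
import re
from typing import NamedTuple, Optional


class MissingTable(NamedTuple):
    project: str
    dataset: str
    table: str
    location: str


# Matches messages like:
#   404 Not found: Table <project>:<dataset>.<table> was not found in location <loc>
_NOT_FOUND = re.compile(
    r"Table (?P<project>[^:\s]+):(?P<dataset>[^.\s]+)\.(?P<table>\S+) "
    r"was not found in location (?P<location>\S+)"
)


def parse_not_found(message: str) -> Optional[MissingTable]:
    """Extract the table reference from a BigQuery 'not found' error line."""
    match = _NOT_FOUND.search(message)
    return MissingTable(**match.groupdict()) if match else None


def table_exists(ref: MissingTable) -> bool:
    """Return True if the table exists (assumes google-cloud-bigquery is installed)."""
    # Imported lazily so the parsing helper works without the client library.
    from google.api_core.exceptions import NotFound
    from google.cloud import bigquery

    client = bigquery.Client(project=ref.project)
    try:
        client.get_table(f"{ref.project}.{ref.dataset}.{ref.table}")
    except NotFound:
        return False
    return True


def dataset_location(ref: MissingTable) -> str:
    """Return the dataset's location, to compare with the one in the error."""
    from google.cloud import bigquery

    client = bigquery.Client(project=ref.project)
    return client.get_dataset(f"{ref.project}.{ref.dataset}").location


if __name__ == "__main__":
    line = (
        "404 Not found: Table bigquery-403520:raw_data.products "
        "was not found in location US"
    )
    print(parse_not_found(line))
```

If `table_exists` returns False after a sync has finished, the sync probably did not write to the dataset that the dbt sources point at.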

Hi @ThaliaBarrera, I just did a sync but I'm still getting some issues. I also don't see the tables created in BigQuery, although I do see the connections in Airbyte. Please check this gist with my current setup:

I put them all together for quicker reference:
https://gist.github.com/CodeResolver/3c76736ef66b1c9bd74e8c894fcf575b

Also, workspace_id was a bit tricky to find. Please check whether mine is correct (I got it from my local Airbyte instance URL), and maybe consider adding a comment about it to the quickstarts.

Thanks!

@CodeResolver Hello, from your code I can see that you are using the Faker source. So I assume you are first experimenting with the Ecommerce analytics quickstart and haven't yet started with your actual source, which is the PokeAPI. Also, I couldn't find faker_source.yml, stg_products, or the other model files in the code you shared; those are the key files for creating stg_products and the other tables under transformed_data in BigQuery.

As for your workspace ID, yes, you can find it in the local instance's URL; it looks like workspaces/************/connections.
There is also a communication gap, as you are almost 10 hours behind Indian Standard Time; otherwise I could have helped more. Ty :)
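
For anyone else hunting for the workspace ID, the lookup described above (taking the path segment after workspaces/ in the Airbyte URL) can be sketched in a few lines of Python. The function name `workspace_id_from_url` is hypothetical, and the example URL uses an assumed local address and an all-zero placeholder ID, not a real workspace.

```python
import re
from typing import Optional
from urllib.parse import urlparse


def workspace_id_from_url(url: str) -> Optional[str]:
    """Return the path segment that follows /workspaces/, or None if absent."""
    path = urlparse(url).path
    match = re.search(r"/workspaces/([^/]+)", path)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Placeholder URL: the host, port, and all-zero ID are assumptions.
    example = (
        "http://localhost:8000/workspaces/"
        "00000000-0000-0000-0000-000000000000/connections"
    )
    print(workspace_id_from_url(example))
```

The returned value is what goes into the workspace_id variable of the Terraform configuration.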

Hi @bishalbera, thanks. Yes, I am trying to make this one work before I tackle the PokeAPI. I do see the 2 files you mentioned; I just added mine to the end of the gist, please take a look.

@CodeResolver OK. It would be nice if we could move to DMs, such as mail or Twitter (if you are OK with it), as this will become a very long conversation here and I don't know whether that is OK.

@CodeResolver OK, from your updated code I can see that you have set up faker_source and stg_products. If you execute dbt run now, it should succeed, provided the connection is set up correctly.

Hey @bishalbera, sure, we can use Discord. Here is a link to a server I just made for this:
https://discord.gg/SMukNdGM