Streaming Pipelines

Homework for Week 5 Streaming Pipelines

Submission Guidelines

Submissions are processed by GitHub Classroom automation, so we cannot accommodate individual requests for changes. Please read all instructions carefully.

Sync fork: Ensure your fork is up-to-date with the upstream repository. You can do this through the GitHub UI or by running the following commands in your local clone of the fork (they assume a remote named upstream that points at the original repository):
```
git pull upstream main --rebase
git push origin <main/your_homework_branch> -f
```
Open a PR in the upstream repository to submit your work:
- Open a Pull Request (PR) to merge changes from your main or custom homework branch in your forked repository into the main branch in the upstream repository, i.e., the repository your fork was created from.
- Ensure your PR is opened correctly to trigger the GitHub workflow. Submitting to the wrong branch or repository will cause the GitHub action to fail.
- See the Steps to open a PR section below if you need help with this.
Complete assignment prompts: Write each query in the file in the submission folder that matches its prompt number. Do not change or rename these files!
Lint your SQL code for readability. Ensure your code is clean and easy to follow.
Add comments to your queries. Use the -- syntax to explain each step and help the reviewer understand your thought process.

A link to your PR will be shared with our TA team after the homework deadline.

Steps to open a PR

Go to the upstream streaming-pipelines repository.
Click the Pull Requests tab.
Click the "New pull request" button on the top-right. This will take you to the "Compare changes" page.
Click the "compare across forks" link in the text.
Leave the base repository as is. For the "head repository", select your forked repository and then the name of the branch you want to compare from your forked repo.
Click the "Create pull request" button to open the PR.

Grading

Grades are pass/fail and are used solely for certification.
Final grades will be submitted by a TA after the deadline.
An approved PR means a Pass grade. If changes are requested, the grade will be marked as Fail.
Reviewers may also leave comments or suggestions with their review. Acting on them is optional and intended for your benefit; any changes you make in response will not be re-reviewed.

Assignment

Your task is to write a streaming pipeline to understand DataExpert.io user behavior using sessionization. Use the connectors and examples in the airflow-dbt-project repo to get started!
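
For readers new to the term, sessionization groups a user's events into sessions separated by periods of inactivity. The following is a minimal batch sketch of that idea in plain Python, not the assignment's solution: the 30-minute gap, the `(user_id, timestamp)` event shape, and the `sessionize` helper are all illustrative assumptions. A streaming engine would typically compute the same grouping incrementally with session windows.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption for illustration: a session ends after 30 minutes of inactivity.
SESSION_GAP = timedelta(minutes=30)


def sessionize(events, gap=SESSION_GAP):
    """Group (user_id, timestamp) events into per-user sessions.

    Returns a dict mapping user_id to a list of sessions, where each
    session is a list of timestamps in ascending order.
    """
    # Collect each user's timestamps.
    by_user = defaultdict(list)
    for user_id, ts in events:
        by_user[user_id].append(ts)

    sessions = {}
    for user_id, stamps in by_user.items():
        stamps.sort()
        user_sessions = [[stamps[0]]]
        for ts in stamps[1:]:
            # Start a new session when the gap since the previous event
            # exceeds the inactivity threshold; otherwise extend the current one.
            if ts - user_sessions[-1][-1] > gap:
                user_sessions.append([ts])
            else:
                user_sessions[-1].append(ts)
        sessions[user_id] = user_sessions
    return sessions


# Hypothetical sample events, deliberately out of order.
events = [
    ("alice", datetime(2024, 1, 1, 9, 0)),
    ("alice", datetime(2024, 1, 1, 9, 10)),
    ("bob", datetime(2024, 1, 1, 9, 5)),
    ("alice", datetime(2024, 1, 1, 10, 30)),
]
result = sessionize(events)
```

With these sample events, the 80-minute pause before the last `alice` event exceeds the assumed gap, so `result["alice"]` holds two sessions while `result["bob"]` holds one.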

Assignment Tasks