The SoCal RUG Hackathon 2023-04 is hosted by the Southern California R Users Group and the UCI Paul Merage School of Business. It is a two-day event where you will "hack" a data set for fun, education, and prizes. The focus of the event is on education and teamwork, with the main goal of taking a data set from its "raw" form all the way through to a final "product" (e.g. a visualization, model, or insight). To frame this process, the event has a competitive aspect: teams will present their work at the end of the event to a panel of judges, with prizes awarded in several categories (see below).
The event will start with a series of practical educational tutorials to get you started with the fundamentals of data analysis using the R programming language. This will be followed by working sessions where teams will explore and analyze the data set in preparation for the presentations. Participants will work in small teams (2 to 5 people). Teams can either be pre-arranged by the participants themselves or assigned at the start of the event.
This event is open to data scientists, enthusiasts, and hackers of all levels, from the beginner to the highly experienced. If you are a beginner, it may be helpful to do some preparatory learning before the event; see the suggested resources below. If you are an experienced user, we look forward to you sharing your expertise with others. Assisting others, both within and between teams, is highly encouraged.
During this event you can use any tool you want; you are not limited to R. It is common for data teams to use a mixture of tools, such as programming languages, interactive data visualization software, and data analytics, reporting, and integration tools. Putting the data into a database and using SQL-based tooling has been an effective approach.
- The hackathon is primarily an educational event, not a competition. However, the hackathon is framed in the context of a competition to provide overall structure. It includes team-based collaboration to create a presentation that will be judged by a panel and prizes will be awarded.
- Novice Users: provides an opportunity to work with real-world data sets from start (acquiring the data) to finish (producing a final “report” on their findings).
- Experienced Users: provides an opportunity to practice data analysis skills in a structured environment, interact with others, and assist people who are new to analyzing data.
When: April 22-23, 2023
- Saturday: 9:00 AM - 10:00 PM
- Sunday: 8:30 AM - 4:00 PM
Where: University of California, Irvine -- Paul Merage School of Business
- Directions & Parking Information. You pay for parking at the kiosk at the entrance. The second page of the kiosk menu has an option for a daily pass, which is much cheaper than paying by the hour. A daily pass costs about $13.00.
- Rooms (tentative)
- SB1 2100 - Main event room
- SB1 2009 - Breakout / meeting room
- SB1 2011 - Breakout / meeting room
- SB1 2013 - Breakout / meeting room
- SB1 2015 - Breakout / meeting room
- SB1 2017 - Breakout / meeting room
- SB1 2019 - Breakout / meeting room
- SB1 3100 - Breakout / meeting room
- SB1 3104 - Breakout / meeting room
- SB1 3107 - Breakout / meeting room
- SB1 4101 - Breakout / meeting room
- SB1 3rd floor patio
Registration
- Cost: $35
- Register through Eventbrite
Time | Event |
---|---|
09:00 AM | Registration Starts |
09:00 AM – 09:30 AM | Light Breakfast, Set-up, Tech Support |
09:30 AM – 10:15 AM | Tutorial: Intro to Text Analysis in R - Emil Hvitfeldt |
10:15 AM – 10:30 AM | Break |
10:30 AM – 11:15 AM | Tutorial: Applications of Text Analysis in R - Emil Hvitfeldt |
11:15 AM – 11:30 AM | Break |
11:30 AM – 12:00 PM | Morning Wrap-up, Questions, Team Formation |
12:00 PM – 01:00 PM | Lunch |
01:00 PM | Registration Closes |
01:00 PM – 01:30 PM | Welcome, Data Set Overview |
01:30 PM – 05:30 PM | Working Session |
05:30 PM – 06:30 PM | Dinner |
06:30 PM – 07:30 PM | Evening Data Challenge |
07:30 PM – 10:00 PM | Working Session |
10:00 PM | Building Automatically Locks |
Time | Event |
---|---|
08:30 AM | Doors Open |
08:30 AM – 12:00 PM | Working Session |
12:00 PM | Presentation Submission Deadline |
12:00 PM – 01:00 PM | Lunch |
01:00 PM – 01:30 PM | Presentation Prep |
01:30 PM – 03:00 PM | Group Presentations |
03:00 PM – 03:30 PM | Judges Deliberate; Complete Event Survey |
03:30 PM – 04:00 PM | Award Presentation & Wrap-Up |
Note: The building is locked all day Sunday. If you leave during the event, please arrange for someone to open the door for you when you return. You can also text John Peach at 802 735 2059 to be let in.
- All participants must register for the event and have a valid ticket to attend.
- All participants must abide by the SoCal RUG Code of Conduct, including the R Consortium and the R Community Code of Conduct.
- Participants are free to come and go during the event. However, any participant who has not checked in by 01:00 PM on Saturday will be considered a "no-show" and their spot may be given to someone else.
- Though this is an R-focused event, participants are free to use any programming language or tool for their work.
- Participants are free to work on their projects both on-site at the hackathon and off-site. It is highly encouraged that participants attend all working sessions to maximize team and group interactions.
- Final submissions must contain only work performed during the event. Please do not use any previous work that you or others have produced as part of team submissions.
- Connect to SSID: UCI Guest
- Agree to the terms of service.
If you are at the Paul Merage School of Business, you can get help at Merage Technology Support:
4293 Pereira Dr
SB1 - Suite 2400
Irvine, CA 92697
(949) 824-4357
Otherwise, call the OIT support line at (949) 824-2222, option 3.
SoCal RUG GitHub Repo: https://github.com/socalrug/
Please install git and clone the following repo before the event, then pull the latest changes just before the event starts.
Command:
git clone git@github.com:socalrug/hackathon-2023-04.git
You can also browse the hackathon repo in a web browser at https://github.com/socalrug/hackathon-2023-04. However, it is strongly recommended that you clone the repo, as it will be updated frequently throughout the event.
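The clone-then-pull routine described above can be wrapped in one small helper. This is only a sketch under stated assumptions: `sync_repo` is a hypothetical function name, and the HTTPS URL is inferred from the repo's web address. It may be the easier option if you have not set up an SSH key with GitHub, which the `git@github.com:` form requires.

```shell
#!/bin/sh
# sync_repo (hypothetical helper): clone the repo on first use, pull afterwards.
# Assumption: the default HTTPS URL is derived from the repo's web address;
# pass the SSH URL as the first argument if you prefer that form.
sync_repo() {
    repo_url="${1:-https://github.com/socalrug/hackathon-2023-04.git}"
    target_dir="${2:-hackathon-2023-04}"

    if [ -d "$target_dir/.git" ]; then
        # Already cloned: fast-forward to the newest commits only,
        # so local edits are never silently merged over.
        git -C "$target_dir" pull --ff-only
    else
        # First run: make a fresh clone into the target directory.
        git clone "$repo_url" "$target_dir"
    fi
}
```

Run `sync_repo` with no arguments before the event to clone, then run it again at the start of each working session to pick up updates. The `--ff-only` flag is a cautious choice: if you have committed changes inside the clone, the pull stops instead of creating a merge.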
All the tutorial material can be found at https://emilhvitfeldt.github.io/tutorial-ocrug-hackathon-nlp/.
A Slack channel has been set up for the hackathon. It will be used for general announcements, and it is also a great place to ask other participants questions.
If you have not created an account on our Slack group, create one using the following link:
Slack Group Sign-up: https://tinyurl.com/socalrug-slack-invite3
Once you have an account, sign in (either in a web browser or by downloading the app on your phone or desktop).
Slack workspace: https://socalrug.slack.com
The channel for the hackathon is hackathon-2023-04.
- All participants will work in teams of between 2 and 5 people.
- Participants are free to form their own teams prior to the event.
- We will assist in team formation at the beginning of the event for any participants that do not already have a team.
- Teams will select a team name.
- Assisting others within and between teams is highly encouraged.
See the presentation guidelines for the requirements. The team prizes will be determined by a panel of judges using the following judging guidelines. The judges' decision is final.
Below is a list of the awards and prizes. Winners will be able to select from the available assortment of prizes.
- Best Presentation
- Best Analysis
- Best Insight
- Best Visualization
On Saturday evening, we will have an hour-long data challenge event. The idea is to take a break from the main hackathon work, meet people, and have some fun. This will be an opportunity to interact with participants outside of your team, practice your data hacking skills on a new data set, and win prizes.
After the event, please complete the SoCal RUG Hackathon 2023-04 Survey. Your feedback is important and helps us continually improve our events.
-
- 1-page note sheets covering data science fundamentals and useful R packages.
-
- Comprehensive book on the complete data science workflow, including data importing/cleaning, visualization, and data analysis
- Focus on tidyverse packages
- Accessible for beginners who have a basic grasp of R
-
- This is the hub website for the core tidyverse packages
- Check out the Packages section and associated links for helpful information on using the packages.
-
- This book digs into the details of R.
- A great resource for more advanced users wanting to learn more about R under the hood.
- There is also a 1st Edition of the book.
-
- Useful when you need to look up more info on specific geoms, stats, scales, etc.
- Check out the examples in the details pages for each function.
-
- Gallery of various types of charts and the code needed to create them.
- Mistakes, we’ve drawn a few: Learning from our errors in data visualisation
- An article from The Economist about mistakes they've made with published data visualizations, and how they'd fix the problems.
- Note: even professionals make mistakes!
- DALEX R Package -- Descriptive mAchine Learning EXplanations
- Provides a set of tools that help you to understand how complex models are working
- Helps you visualize what's going on
- Check out the cheatsheet
Food, drinks and snacks will be provided throughout the event. We will have vegetarian options available. Please feel free to bring any additional food for yourself if you would like to supplement the meals or if you have other specific dietary constraints.
- Saturday
- Lunch: Tacos, Chips, Rice & Beans
- Dinner: Chicken & Falafel Bowls
- Sunday
- Lunch: Sandwich Boxed Lunches. There will be a limited number of gluten free meals.
- Snacks and Drinks
- Coffee
- Various teas
- Soft drinks
- Water
- Various snacks