We define "datathon" very broadly:
A datathon is a hackathon that uses data. Hacking is creative problem solving. (It does not have to involve technology.) A datathon is any event of any duration where people come together to solve problems. Participants typically form groups of about 2-6 individuals, take out their laptops (if the event is technology themed), and dive into problems. Most datathons we run also have a parallel track of courses and workshops, which is great for all participants and especially for newcomers. Most datathons are 1 or 2 days long, with participants together in one location. 1 Million Women To Tech's datathon is longer (about 3 months), and participants can form remote groups, which gives them great practice at working professionally in a distributed workforce.
Datathons have gotten a bad rap because some have an unhealthy, competitive structure and set unrealistic expectations. Don't participate in a datathon with that mindset and you'll be on the right track. Here are the goals to keep in mind:
Be welcoming to newcomers to the community. Provide an opportunity for participants to learn something new. Provide a space and a time for participants to make headway on problems they are interested in. Don't expect to have actually solved a problem by the end of the datathon. Real life problems are hard! Think of the datathon as a pit-stop on a long journey to solve problems or as a training session to prepare participants for solving problems.
Since you're not going to solve a problem, don't put unrealistic (and unhealthy) pressure on yourself and your teammates. Don't stay up all night, don't pump yourself with caffeine, and don't make winners and losers. Just don't. Participants should come energized and be greeted with positive energy.
The hardest thing about running a successful datathon is being welcoming to newcomers and helping them get involved in an activity.
Newcomers often suffer from "imposter syndrome", the feeling that they don't belong because they don't have skills, aren't smart enough, etc. They're wrong, of course, but until they feel like they belong they will not be able to have a fulfilling experience. It is the datathon organizers', the mentors' and the more experienced participants' job to help them realize they have something to contribute.
First-time datathon participants are often overwhelmed when it comes time to find a project to work on. They may not yet know how to relate their own skills to the sorts of projects being worked on. Knowing how to be useful is a skill in itself. You will need to guide them to a project and through a process that helps them realize how they can contribute. If you have too many lost participants and not enough help in getting them started on a project, they will leave, so let's try to avoid that.
Mentors and Mothers of Community must make sure that everyone has something to do. One way to do this is to have a list of project leaders ahead of time: people you know are coming with particular projects that you can guide other participants to. And you can work to make sure your hacking projects are ready to accept newcomers. We also hold non-project activities — courses, described below — which are easier for newcomers to join.
Goal: Have Project Leaders publish their project details on the DIY forum: https://1millionwomentotech.com/groups/diy/forum/topic/datathon-project-briefs/
Each person posts their personal intro to the group(s) they are interested in working with.
Helpful details in your post:
- Name
- Discord username
- Job title
- Are you new to datathons?
- What kind of hacker are you? Examples: Developer. Designer. Data Scientist. Domain Expert. Government Staff. Communicator. Project Manager. Advocate.
- What are you interested in hacking on? (free form question)
- Which datathons and courses are you interested in?
- How you heard about the event
- Special needs/requests
The hacking track is for participants to dive into problems. Often groups of 2-6 individuals form around a project, such as building a new data visualization, writing a document, or collaboratively investigating a problem. Participants take out their laptops, connect via a Discord group video call or another video conference, and get working.
Hacking begins with project introductions. Participants that bring projects to the event have an opportunity to briefly (1 minute max) explain what they are working on during the Week 2 Mission Update on Fri Jan 25, 2019 at 10am Pacific Time so that other participants can join that project. At the end of the event, a wrap-up session gives each project a chance to demonstrate some accomplishments.
Not every project makes a good datathon project. It is extremely important to maximize the following qualities in the projects at your event:
- Clearly articulated. Projects should have a clear question or problem they are trying to solve plus a reasonably specific proposed solution.
- Attainable. Most projects will accomplish about 25% of what they think they can accomplish in the limited time they have. Manage each project's goals so participants are able to feel accomplished at the end of the session, not interrupted.
- Easy to onboard newcomers. Projects should have ready-to-go tasks for newcomers with a variety of skills and at a variety of skill levels. For coding projects, these tasks can't require an intimate understanding of the code base, and the build environment should take less than 20 minutes to spin up. Make a list of tasks or create GitHub issues ahead of time!
- Led by a stakeholder. A stakeholder (or "subject matter expert") guides a project to real-world relevance. Projects without a stakeholder can "solve" a problem that doesn't exist. Ideally the leader (or one of the leaders) is a stakeholder, or a good proxy for a stakeholder. I strongly recommend reviewing Laurenellen McCann's Build With, Not For series on involving stakeholders in all civic tech work. Additionally, it is never enough for a project leader to just be an ideas person. Beware when the leader is a stakeholder but can't foresee how he or she might be implementing along with the rest of the team.
- Organized. For projects with four or more members, especially newcomers, the project leader's role should be to coordinate, ensuring each team member has something to work on and helping to welcome new team members.
Treat these bullets like a checklist. Projects that think about themselves in terms of these qualities tend to be happier and more productive.
If you know what projects are going to be worked on at the event, the earlier you can get those projects thinking about this the better. Meet with project leads and talk about these components of their project ahead of time if possible. As a Mentor, having this information about projects can also help you route participants to projects they may want to work on.
A great resource to read is the Google Summer of Code Guide on Defining a Project (Ideas List) https://google.github.io/gsocguides/mentor/defining-a-project-ideas-list
A themed datathon is one in which the projects are confined to a particular problem area, such as digital humanities or community engagement. Themed datathons are able to attract subject matter experts (something that open-ended datathons are not good at), and projects typically revolve around problems that the subject matter experts bring to the table.
When themed datathons are also technology datathons, there is a common problem: Subject matter experts can readily identify problems in their field but cannot always turn those problems into workable technology projects. Other participants may be ready to apply their skills but not know anything about the datathon's theme. Bridging that gap requires careful moderation and support from Mentors.
What often results is a division of the participants into three groups:
- Subject matter experts and other participants successfully working together.
- Subject matter experts working with other subject matter experts on problem investigation but not implementation.
- Other participants struggling to find something relevant to work on / implementing a solution of minimal value to solving the theme's actual problems.
#1 is great. #2 is fine if the group is happy. But #3 is bad: participants without subject matter guidance will feel lost. To avoid this, Mentors will be asked to check in with participants and assign them to workable projects until everyone is onboarded to a project. (*Due to a shortage of womanpower we can only do this for VIP and Gold Members; free DIY-ers will have to, well, do-it-yourself ;) )
Work with the subject matter experts at the beginning of the event to turn their problems into projects. See the section Cultivating Good Projects above to ensure there is a coherent question, that the necessary resources exist (e.g. datasets), and that the skills needed for the project match the skills expected to be brought by other participants (and in sufficient quantity).
Additionally, a subject matter expert may propose many ideas but she can only effectively participate in a single project during the event, so ensure that there is at least one subject matter expert + workable project for about every five non-expert participants.
The subject of the Winter Of Data 2019 Datathon is the Gender Gap. You will be using the data sources listed by the World Economic Forum, but you are also welcome to bring in other datasets that you can find.
Step 1: Get familiar with the Global Gender Gap Report. The newest edition is the 2017 one.
The Global Gender Gap Index examines the gap between men and women in four fundamental categories (subindexes) and 14 different indicators that compose them. The subindexes are Economic Participation and Opportunity, Educational Attainment, Health and Survival, and Political Empowerment. The highest possible score is 1 (equality or parity) and the lowest possible score is 0 (inequality). There are three basic concepts underlying the Global Gender Gap Index, forming the basis of how indicators were chosen, how the data is treated and the scale used. First, the Index focuses on measuring gaps rather than levels. Second, it captures gaps in outcome variables rather than gaps in input variables. Third, it ranks countries according to gender equality rather than women's empowerment.
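To make the 0-to-1 scale concrete, here is a minimal Python sketch of gap-style scoring. It assumes each score is a female-to-male ratio capped at 1 (parity) and that the overall index is a plain average of the four subindex scores. The report's own weighting of indicators within each subindex is not reproduced here. The function names `gap_score` and `overall_index` and all numbers are hypothetical and for illustration only.

```python
from statistics import mean


def gap_score(female_value: float, male_value: float) -> float:
    """Return the female-to-male ratio, capped at 1.0 (parity)."""
    if male_value <= 0:
        raise ValueError("male_value must be positive")
    return min(female_value / male_value, 1.0)


def overall_index(subindex_scores: dict) -> float:
    """Assumption: the overall score is an unweighted mean of the subindexes."""
    return mean(subindex_scores.values())


# Hypothetical values, one illustrative indicator per subindex.
# These are NOT figures from the Global Gender Gap Report.
scores = {
    "Economic Participation and Opportunity": gap_score(55.0, 80.0),
    "Educational Attainment": gap_score(102.0, 100.0),  # capped at 1.0
    "Health and Survival": gap_score(97.0, 100.0),
    "Political Empowerment": gap_score(25.0, 100.0),
}

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print(f"Overall (simple average): {overall_index(scores):.3f}")
```

Working with ratios rather than raw values reflects the first concept above (gaps, not levels). Capping at 1 is one way to reflect the third concept: no extra credit is given where women outperform men.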
Step 2: Review available data sources.
- https://ourworldindata.org/economic-inequality-by-gender#data-hubs-dedicated-to-gender-statistics
- https://knoema.com/atlas/topics/Demographics/datasets
- https://ucsd.libguides.com/data-statistics/gender
You are welcome to find or bring your own datasets.
IMPORTANT: You must have the rights to those datasets and be allowed to publish the results.
Step 3:
Your challenge, should you choose to accept it, is to create your own indicator (the 15th one ;) ).
Intermediate challenge: choose a specific country and predict when the economic gender pay gap will be closed (year).
Beginner challenge: use the Summer Of Code 2018 data sets to plot and analyze the preregistrations by date and by country.
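For the intermediate challenge, one possible starting point is a straight-line trend: fit a line to a country's yearly gap scores and see where it reaches parity (a score of 1). The sketch below is only a baseline under that assumption, since real gaps rarely close at a constant rate. The function names and the data points are hypothetical and not taken from any report.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_parity_year(years, scores, target=1.0):
    """Year (as a float) when the fitted line reaches `target`.

    Returns None when the trend is flat or declining, because the
    line would then never reach parity.
    """
    slope, intercept = fit_line(years, scores)
    if slope <= 0:
        return None
    return (target - intercept) / slope


# Hypothetical scores for one country (illustration only).
years = [2010, 2012, 2014, 2016]
scores = [0.60, 0.62, 0.64, 0.66]

year = predict_parity_year(years, scores)
print("No closing year on this trend" if year is None else f"Parity around {year:.0f}")
```

A natural next step is to compare this baseline with other models and to state how sensitive the predicted year is to the choice of starting year.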
Each datathon we run at 1millionwomentotech has an open-ended category so that those with pre-existing ideas can work on them and attract teammates. It is the Stakeholder's responsibility to ensure that the project is explained, that there is a clear project question, and that resources (e.g. datasets) are available.
Onboarding participants onto existing projects can be very difficult. It is one of the hardest parts of hacking. Mentors should look for ideas for new projects that are especially easy for participants to get started with if they can't join an existing project.
Having project ideas ready is especially important as we do not know how many participants will bring projects! And always be open to project ideas from participants. A project of one, meaning someone working alone, is okay too!
To help complete newcomers, we are including two small .csv (comma-separated values) files that contain the Summer Of Code participant numbers by date and by country. The questions are very specific, and with a bit of self-study most beginners should be able to answer them.
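As a rough idea of what the beginner analysis could look like, here is a small Python sketch that totals preregistrations by date and by country using only the standard library. The column names (`date`, `country`, `preregistrations`) and the sample rows are assumptions made up for illustration. The real files are separate and may be laid out differently, so adjust the names to match them.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data; the real .csv files may use other columns.
SAMPLE_CSV = """date,country,preregistrations
2018-06-01,Hungary,12
2018-06-01,India,30
2018-06-02,Hungary,8
2018-06-02,Nigeria,15
"""


def totals_by(column, csv_text, count_column="preregistrations"):
    """Sum the count column for each distinct value of `column`."""
    totals = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[column]] += int(row[count_column])
    return totals


by_date = totals_by("date", SAMPLE_CSV)
by_country = totals_by("country", SAMPLE_CSV)

# A quick text "plot": one '#' per 5 preregistrations, largest first.
for country, total in by_country.most_common():
    print(f"{country:10s} {'#' * (total // 5)} {total}")
```

To read a real file, replace `io.StringIO(csv_text)` with an opened file handle. The resulting totals can then be passed to a plotting library of your choice to draw proper charts.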
Do not allow anyone to pitch an idea that they will not be working on at the event, unless there really are not enough ideas to go around. Otherwise, this is a waste of everyone's valuable time.
Once hacking has begun, do not interrupt the hackers except to ensure that the hacking is going smoothly, to check that everyone has something to do, and to keep people on the overall schedule.
A successful datathon might be just hacking, just training, or both hacking and training.
As we expect a significant number of newcomers, having training workshops is a great way to give them something to do that they will be more comfortable with than diving into hacking.
We run weekly courses to introduce participants to a particular technical skill useful for data science, and the daily data course releases a new lesson every weekday.
Courses are interactive as much as possible, with Mentors available to answer questions over Discord and Forum. For Course syllabi and schedule see https://1millionwomentotech.com/winter-of-data-2019 > Courses.
For large events like this, we need sponsors to help us cover the costs. At the moment we are $630k short of being able to deliver all of Summer Of Code and Winter Of Data to 1 million participants.
Sponsors will give something — cash, prizes, t-shirts — with the expectation that they get something out of their support for your event. They might be recruiting/hiring and are looking to scout out our attendees, or they might be marketing a product that they want to promote.
We will decide on a case-by-case basis what we are willing to give sponsors in return for their support. We will certainly thank our sponsors, by name, during our opening and closing sessions, and will probably tweet/etc. our thanks too.
Beyond that, if they have a great female speaker, we may give them time on the digital podium to speak to our participants, or a page in the online forum to show off their offering. We have to strike the right balance between bringing in enough sponsorship and not interfering with the goals of the event.
If you are interested in sponsoring or know a company who might, please consult the https://1millionwomentotech.com/sponsorshipdeck and get in touch!
Technology events have a history of not always being welcoming to women and minorities. We need to change that. You can be a part of that change by being part of 1 Million Women To Tech and spreading the word. We also have a zero-tolerance harassment policy and a code of conduct for the event. A code of conduct is not just about enforcing rules. It sets community norms and sends a signal to would-be participants that you are trying to create a welcoming environment. And, of course, if there is a problem having a code of conduct ahead of time helps you resolve the issue.
We have adopted the MLH Code of Conduct. Please review it.
Mothers of Community aim to hold happy hours throughout the week to help participants get to know each other in a relaxing setting. Happy hours happen on Zoom with up to 100 participants, whom we can place in 25 Break Out Rooms (4 people each).
Official communications are only through the 1millionwomentotech DIY Forum and Discord.
- VIP and Gold members can get support via Kartra Helpdesk for now. A full-featured helpdesk system is coming to the new membership site, and we are working hard on it.
- Free DIYers, please post your questions in the FAQ: https://1millionwomentotech.com/wod2019/faq
Anyone who has brought a project to work on should introduce the project to everyone during the Mission Update (Fridays at 10am). This is sometimes called "project pitches." Keep each pitch short: the leader's name and affiliation, a problem statement, the solution, and the skills/help needed. Project leaders tend to talk for as long as they can, so we may need to cut them off after one minute to be respectful of the audience's time. We encourage leaders to think of this not as recruiting but as boasting how awesome their days are going to be.
Mothers of Community are managing the Discord channels. MOCs should go around to check that every project is going smoothly. See if anyone needs anything or can't find something to work on. Keep people on the overall schedule. Alert everyone when it is time for Mission Update and deadline for submission. Leading up to wrap-up, make sure each project is prepared to explain what they did. Get them to record their progress on the forums.
The wrap-up session gives everyone a chance to hear what everyone else worked on during the datathon. We ask participants to report what they accomplished or what they learned (especially course participants). Give folks rounds of applause.
Keep each project to 1 minute, and if a team is going to show something on the screen, make sure it is ready before the wrap-up session begins. Links (as URLs) and demos (as prerecorded YouTube videos) should be added to the GitHub submission.
Opening: Monday Jan 21, 2019.
Daily Data: Monday - Friday at 6:00am-6:02am on 1millionwomentotech Facebook and YouTube, and one new lesson released on 1millionwomentotech.com each day
All times are Pacific Time (PST and PDT) - be careful of the time change.
Credit: This guide is based on the Summer Of Code Hackathon Guide.
- create 1millionwomentotech profile at https://1millionwomentotech.com/request-invite
- read https://1millionwomentotech.com/programs/winter-of-data-2019/
- watch onboarding video
- subscribe to the [DIY] 1MWTT Google calendar
- post project brief to https://1millionwomentotech.com/groups/diy/forum/topic/datathon-project-briefs/ if you wish to lead, or read the projects if you wish to join a team
- join the Week 2 Mission Update on Fri Jan 25, 2019 at 10am Pacific Time (https://www.youtube.com/watch?v=nWWhnQar0H8) to present or listen to project briefs
- form teams
- get data from 1mwtt GitHub Classroom https://classroom.github.com/g/FoBcqQwK