Even more concretely, work with a non-profit organization (or another organization that doesn’t have the economic power to hire programmers or data scientists) to create a project that is meaningful for the organization and also shows off your skills. It’s a great way to do demonstrative and meaningful work while also aiding an organization that could use your help, and likely has problems people are paying attention to solving. Win-win.
Current Value vs Potential: Look for companies that will hire you for your potential. It’s important to be upfront about your grit, self-sufficiency, and ability to hit the ground running.
Luckily, with disciplines like data science, the market is on your side.
Sometimes companies can spring for a Junior Data Scientist and invest in your growth, which is really what you wanted from the beginning.
Everyone will tell you this, but I work on product so I’ll underline it even more strongly: Learn to write production-level code. The more technical you are, the more valuable you are. Being able to write production code makes you eminently hirable and trainable.
[*NB: Don’t think for a minute that I don’t believe in the tenets of a true liberal education - quite the contrary. I continue to read philosophy and history, in part because we cannot draw fully upon the knowledge of man without doing so. These are essential elements to being a purposed, ethical, and effective person - but they don’t directly accelerate a career. The true liberal education has nothing to do with market forces, and never should. Higher Education as it exists today and Liberal Education should be held as wholly uniquely-motivated institutions.]
How was your self-taught path to becoming a data scientist received by company recruiters? What advice would you share with entrepreneurial individuals who are interested in the field?
Talk with people who can recognize hustle and grit, and not necessarily those who are looking to match a pattern drawn from your previous experience. Often, these kinds of people run startups.
Recruiters gave me a very real response: They didn’t see my course of self-study as legitimate. It’s hard to give yourself a stamp of approval and be taken seriously. I wouldn’t recommend that just anyone do what I did. It will take a while for autodidactism to become more accepted, and maybe it will never be a primary pattern. But maybe people like me can help expose this as a viable way to advance professionally. I know that great companies like Coursera will continue to innovate on these new forms of education, keep quality high, and democratize access.
tl;dr
If you want to get to the next level, wherever your next level may be, it’s possible to pave your own road that leads you there. It’s a monstrously tough road, but it’s your road.
DREW CONWAY Head of Data at Project Florida
Human Problems Won’t Be Solved by Root Mean-Squared Error
After graduating with degrees in both computer science and political science, Drew found himself working at the intersection of both fields as an analyst in the U.S. intelligence community, where he tried to mathematically model the networks of terrorist organizations.
After spending a few years in DC, Drew enrolled in a political science PhD at New York University. It was here that he drew up his famous Data Science Venn Diagram. It was also during this time that he co-founded DataKind, a nonprofit organization which connects data experts with those who need help. After a stint at IA Ventures as their Data Scientist in Residence, Drew joined Project Florida as Head of Data, where he uses data science to give individuals better insights into their health.
Drew is also the co-author of the O’Reilly book, Machine Learning for Hackers.
Your data science Venn diagram has been widely shared and has really helped many people get an initial sense of what data science is. You created it a long time ago, back in 2010. If you had the chance to create it again today, would you change any part of it?
Quite a lot. I can speak a little bit about the history of it, which I think is probably less glorious than people know.
I was a graduate student at NYU and was a teaching assistant for an undergraduate class in Comparative Politics. As a teaching assistant in those classes, your mind wanders because you already know the material.
It was 2010, and the idea of data science was much more primordial. People had less of a sense of what data science was. At that time I was thinking about the definition of data science. I had been speaking to people like Mike Dewar, Hilary Mason and some other people in New York, was influenced by their ideas and some of my own, and came up with the definition while sitting there in class.
The original Venn diagram I made on data science, which ended up becoming quite well-known, was drawn using GIMP as the editor, the simplest, cheapest program in the world. But I’m very happy that it seems people have attached themselves to it and that it makes sense to them.
What has become more apparent to me as the years have passed is that the thing missing from it is the ability to convey a finding, or relevant information once an analysis is complete, to a non-technical audience. A large amount of the hard work that most data scientists do is not necessarily all data wrangling and modeling and coding. Instead, once you have a result, it’s about figuring out how to explain that result to people who are not necessarily technical or who are either making business decisions or making engineering decisions.
Really, it’s all about conveying a finding. You can use words to do that, you can use visualization to do that, or you can develop a presentation to do it. A well-rounded data science team will have someone who is very competent at this. If your organization is making decisions based on your analysis, you need to be sure they understand why.
This echoes parts of what we’ve heard when we talked with Hilary Mason and Mike Dewar. Both of them emphasized the storytelling part and how to carefully communicate the analysis part.
It’s something that receives the least amount of thought, but turns out to be one of the most important things once you’re doing this in the wild. Even the people who have had success in data science up to this point have just been naturally good at it, whether they were blogging about it or giving good presentations. Both Mike and Hilary are examples of people who are good at doing that. They are naturally good at it. People who are not naturally good at it can learn it through coaching and mentorship.
In just the same way, if you’re not a good coder you can become a better coder through coaching and mentorship.
You said on a Strata panel: “Human problems won’t be solved by root mean square error.” What did you mean by that?
I think when people think about data science, or even machine learning applied to data science, people think that we have a well-defined problem, and we have our data set. We need to find a way of taking that problem and that data set and producing an answer that is better than the one that we currently have.