A survey of data scientists reveals a field of great opportunities but also room for improvement.
What’s new: The 2022 “State of Data Science” report from Anaconda, maker of a popular Python distribution, surveyed 3,493 students, teachers, and employees in data science, machine learning, and AI about their work and opinions of the field.
Who they surveyed: The poll reached data scientists in 133 countries (40 percent in the U.S. or Canada). Of the respondents, 76 percent were men, 23 percent women, and 2 percent nonbinary, and 80 percent had at least an undergraduate-level degree. The majority (55 percent) worked for firms with 1,000 or fewer employees, while 15 percent worked for companies with over 10,000 employees.
State of the field: Participants were asked to rate various aspects of their day-to-day work and share their hopes for the future. They reported widespread satisfaction but voiced worries about the field’s potential for harm.
- On the job, 70 percent of respondents reported being at least moderately satisfied. Professors, instructors, and teachers reported the highest levels of job satisfaction.
- Respondents spent an average of 51 percent of their time at work preparing, cleansing, or visualizing data and 18 percent selecting and training models.
- Of those who deployed models, 60 percent deployed them on-premises, while 40 percent deployed them in the cloud.
- Most respondents preferred to program in Python, and 31 percent used it every day. SQL was used daily by 16 percent. Single-digit percentages were daily users of other languages including C/C++, Java, and Rust.
- Of the students surveyed, 27 percent hoped to work for a well-established startup, 23 percent for an industry giant, and 22 percent for an academic institution or research lab.
Challenges: Respondents also answered questions about the challenges they face and those facing data science at large:
- Many of those surveyed felt their organizations could do more to support them in their work. The biggest barriers were under-investment (65 percent), insufficient access to talent (56 percent), and unrealistic expectations (43 percent).
- Students noted obstacles in finding internships (27 percent), job listings that weren’t clear about the qualifications required (20 percent), and lack of a professional network or mentoring (15 percent).
- 62 percent said their organizations were at least moderately affected by a scarcity of skilled workers. Those who were employed cited a dearth of talent in engineering (38 percent) and probability and statistics (33 percent).
- 32 percent said the biggest problem in the field was the social impact of bias, followed by data privacy (18 percent) and “advanced information warfare” (16 percent).
Behind the news: The U.S. Bureau of Labor Statistics forecasts that the number of computer and information research scientists will grow by 21 percent between 2021 and 2031, far higher than the 5 percent average across all occupations. Anecdotal evidence suggests that demand for skilled AI professionals already outstrips supply.
Why it matters: It’s great to hear that data science rates highly in both job satisfaction and market demand. The areas in which respondents expressed a desire for improvement — bias, privacy, the dearth of skilled engineers — suggest possible avenues for career development.
We’re thinking: Given that preparing, cleansing, and visualizing data takes up 51 percent of time spent on data science, and selecting and training models occupies only 18 percent, it appears that most practitioners already do data-centric AI development. They just need better principles and tools to help them do this work more efficiently!