Challenging the myth of the Data Scientist Unicorn


Data scientists are often described as unicorns: like the rare and magical creature, the mix of skills that makes a good data scientist is difficult to find.  At D2D CRC we challenge this premise and prefer to think of data science as a set of related roles and competencies, covering data scientists, data engineers and data analysts.  We believe this team perspective is more useful because it gives organisations and individuals clarity about workforce requirements and development needs.

Creating a Data Science Competency Framework

We also noticed that data science competencies are currently not well defined in Australia, so through a process of consultation we developed the Data Science Competency Framework.  The framework provides a set of 20 competencies for data science, data engineering and data analytics roles across four maturity levels.  The competencies can be used for multiple purposes, including supporting role clarity, recruitment processes and career pathways, and recognising people’s skills, experience and achievements.  They can also be used to identify individual and team strengths and areas for development, particularly with the help of the Development Planning Tool, which enables individuals to undertake a self-assessment.

Some of the 20 competencies are:

  • Project Scoping
  • Data Sources Identification
  • Data Audit
  • Create Models
  • Present to Stakeholders
  • Data Pre-Processing

Each competency is defined at each maturity level, and the Development Planning Tool lets individuals consider how well they can demonstrate each one.

For example:

Data Pre-Processing at the Senior Level

“Creates the required data set utilising an understanding of complex problems, data formats and desired modelling technique.  Ability to fuse data sources using knowledge of a range of data pre-processing techniques such as transformation, integration, normalisation, feature extraction, to identify and apply appropriate methods”

If development is required for this competency, the Development Planning Tool suggests a range of options, from short courses and online courses provided by reputable educators to on-the-job learning such as joining a Meet-up or finding a mentor.
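To make the techniques named in the Senior-level definition more concrete, here is a minimal Python sketch of integration, feature extraction and normalisation applied to two small data sources.  It is an illustration only and is not part of the Framework or the Development Planning Tool: the sample records, the function names and the choice of min-max scaling are all assumptions made for this example.

```python
from statistics import mean

# Hypothetical records from two sources, keyed by customer id (illustrative only).
purchases = {"c1": [20.0, 35.0, 5.0], "c2": [100.0], "c3": [10.0, 10.0]}
profiles = {"c1": {"age": 34}, "c2": {"age": 51}, "c3": {"age": 27}}


def integrate(purchases, profiles):
    """Integration: join the two sources on their shared key."""
    return {
        key: {"age": profiles[key]["age"], "amounts": amounts}
        for key, amounts in purchases.items()
        if key in profiles
    }


def extract_features(record):
    """Feature extraction: summarise the raw purchase amounts."""
    amounts = record["amounts"]
    return {
        "age": record["age"],
        "n_purchases": len(amounts),
        "mean_spend": mean(amounts),
    }


def min_max_normalise(rows, column):
    """Normalisation: rescale one column to the range [0, 1]."""
    values = [row[column] for row in rows.values()]
    low, high = min(values), max(values)
    span = high - low
    return {
        key: {**row, column: (row[column] - low) / span if span else 0.0}
        for key, row in rows.items()
    }


def build_dataset(purchases, profiles):
    """Fuse both sources into one modelling-ready data set."""
    merged = integrate(purchases, profiles)
    features = {key: extract_features(rec) for key, rec in merged.items()}
    for column in ("age", "mean_spend"):
        features = min_max_normalise(features, column)
    return features


dataset = build_dataset(purchases, profiles)
```

In practice the appropriate methods depend on the problem, the data formats and the intended modelling technique, which is the judgement the Senior-level definition describes.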

Improving the Data Science Competency Framework

To help ensure we are on the right track, and because we practise Agile, we recently piloted the Development Planning Tool within five organisations, including the D2D CRC.  Approximately seventy people completed the self-assessment and then gave us feedback to help improve the Tool and Framework further.  Key enhancements will include:

  • Incorporating more practical details into the competencies, including techniques and methodologies, with examples where possible
  • Expanding on the competencies oriented towards data analytics
  • Incorporating more technical skills, experience and abilities at senior levels, with less focus on management factors
  • Identifying additional development options that relate to specific competencies and can be undertaken quickly

While the Framework and Tool are being updated, we hope to continue the critical conversations about data science competencies and their development.  If you are interested in finding out more about the Data Science Competency Framework or Development Planning Tool, please email me (Megan Prideaux, Education and Training Manager) via [email protected]