Responsibilities include, but are not limited to:
• Strong desire to grow a career as a Data Scientist in highly automated industrial manufacturing, performing analysis and machine learning on terabytes to petabytes of diverse data.
• Experience in the following areas: statistical modeling, feature extraction and analysis, and supervised, unsupervised, and semi-supervised learning. Exposure to the semiconductor industry is a plus but not a requirement.
• Ability to extract data from different databases via SQL and other query languages and to apply data cleansing, outlier identification, and missing-data techniques.
• Strong software development skills.
• Strong verbal and written communication skills.
• Experience with or desire to learn:
    • Machine learning and other advanced analytical methods
    • Fluency in Python and/or R
    • PySpark, SparkR, and/or sparklyr
    • Hadoop (Hive, Spark, HBase)
    • Teradata and/or other SQL databases
    • TensorFlow and/or other statistical software, including scripting capability for automating analyses
    • SSIS, ETL
• Experience working with time-series data, images, semi-supervised learning, and data whose distributions change frequently is a plus
• Experience working with Manufacturing Execution Systems (MES) is a plus
• Published papers at CVPR, NIPS, ICML, KDD, and other key conferences are a plus, but this is not a research position