Join the team of a San Francisco-based company remotely from Mexico! This is an exclusive opportunity to work directly with the SourceScrub team at a cutting-edge data service technology company.

**What is SourceScrub?**

SourceScrub is the world's leading data service for firms looking to research, find, and connect with privately held companies. Our Private Company Intelligence platform allows deal teams to take a data-driven approach in a traditionally opaque segment of the market. We combine state-of-the-art technology with an unmatched QA process, resulting in the freshest and most accurate data set available.

SourceScrub provides research and information management systems that incorporate thousands of online sources, from trade show exhibitor lists to industry buyers' guides. SourceScrub works 24/7 to ensure we offer accurate and current data for prospecting investment opportunities. Our sourcing data and prospecting tools promise to save time and increase deal flow.

**What You'll Do**

You will write infrastructure as code, tooling, and automation. You will promote DevOps workflows, processes, and best practices.
You will design ingestion pipelines.

**_Tech we use_**

- Languages: Python, Bash, etc.
- Operating systems: Windows, AWS Linux (Red Hat), Ubuntu
- Data: NiFi, Storm, Kafka, ZooKeeper, Elasticsearch, etc.
- Infrastructure: Azure, AWS Elastic Container Service (Docker), and serverless with AWS Lambda
- Metrics: Datadog
- Continuous Integration: CircleCI
- IaaS: AWS
- Version Control: Git

**_Responsibilities_**

- Understand the architecture of our Cloud and Data Systems and the role of each component.
- Help with data ingestion flows (NiFi).
- Integration engineering (CircleCI/Jenkins/TeamCity).
- Application scaling and tuning using Python.
- Software and systems configuration management.
- Write custom tools, infrastructure, monitoring, and automation in Python.
- Create and promote a good company culture.

**_Experience/Qualifications_**

- 2-5 years of experience with Python software development.
- Bachelor's degree in Computer Science, Data Science, or a related field.
- Strong experience in processing data and drawing insights from large data sets.
- Good familiarity with one or more of these libraries or similar: pandas, NumPy, SciPy, etc.
- Experience with Python development environments and visualization tools, such as but not limited to Jupyter, Google Colab notebooks, Matplotlib, Plotly, and geoplotlib.
- Experience with microservices when working with datasets.

**_Nice to have_**

- Any experience or interest in creating and using advanced machine learning algorithms and statistics: regression, simulation, scenario analysis, modeling, clustering, decision trees, neural networks, etc.
- Knowledge of spaCy and similar NLP libraries such as NLTK, textacy, etc.

Job type: Full-time, permanent

Salary: $75,000.00 - $90,000