Data Modeling, Data Warehouse, ETL, Load

GSPANN is seeking a highly skilled Data Architect to join our team in Mexico. The ideal candidate will lead and define data architecture, ensure data quality, and establish data governance processes. This role involves handling millions of rows of data daily, solving significant big data challenges, and building top-tier data solutions that drive key business decisions.

Responsibilities
- Design, implement, and lead data architecture, ensuring high standards of data quality and governance across the organization.
- Establish and promote data modeling standards and best practices.
- Develop and advocate for data quality standards and practices.
- Create and maintain data governance processes, procedures, policies, and guidelines to ensure data integrity and security.
- Promote the successful adoption of data utilization and self-service data platforms within the organization.
- Create and maintain critical data standards and metadata to enable data as a shared asset.
- Develop standards and write template code for sourcing, collecting, and transforming data for both streaming and batch processing.
- Design data schemas, object models, and flow diagrams to structure, store, process, and integrate data.
- Provide architectural assessments, strategies, and roadmaps for data management.
- Implement and manage industry best practice tools and processes, including Data Lake, Databricks, Delta Lake, S3, Spark ETL, Airflow, Hive Catalog, Redshift, Kafka, Kubernetes, Docker, and CI/CD pipelines.
- Translate big data and analytics requirements into scalable, high-performance data models, guiding data analytics engineers.
- Define templates and processes for designing and analyzing data models, data flows, and integrations.
- Lead and mentor Data Analytics team members in best practices, processes, and technologies for data platforms.

Skills and Experience
- Bachelor's or Master's degree in Computer Science or a related field.
- Over 10 years of hands-on experience in Data Warehousing, ETL processes, Data Modeling, and Reporting.
- More than 7 years of experience in productizing and deploying Big Data platforms and applications.
- Expertise in relational/SQL databases, distributed columnar data stores/NoSQL databases, time-series databases, Spark Streaming, Kafka, Hive, Delta, Parquet, Avro, and more.
- Hands-on subject-matter expertise in the architecture and administration of Big Data platforms and Data Lake technologies (AWS S3/Hive), and experience with ML and Data Science platforms.
- Extensive experience understanding complex business use cases and modeling data in the data warehouse.
- Proficiency in SQL, Python, Spark, AWS S3, the Hive data catalog, Parquet, Redshift, Airflow, and Tableau or similar tools.
- Proven experience building custom Enterprise Data Warehouses or implementing tools such as Data Catalogs, Spark, Tableau, Kubernetes, and Docker.
- Good knowledge of infrastructure requirements such as networking, storage, and hardware optimization, with hands-on experience with Amazon Web Services (AWS).
- Strong verbal and written communication skills, with the ability to work efficiently across internal and external organizations and virtual teams.
- Demonstrated industry leadership in Data Warehousing, Data Science, and Big Data technologies.
- Strong understanding of distributed systems and container-based development using Docker and the Kubernetes ecosystem.
- Deep knowledge of data structures and algorithms.
- Experience working in large teams using CI/CD and agile methodologies.