Key accountabilities
- Assemble large, complex data sets that meet functional and non-functional requirements, following big data best practices
- Source data from internal and external data sources, engaging with technical subject matter experts
- Explore, analyse, and profile data from various internal and external data sources, and assist data scientists in preparing data for analytical purposes
- Ensure delivered solutions meet Systems Integration and User Acceptance Testing criteria
- Productionalise solutions and ensure daily data refresh processes run successfully
Additional requirements
- Experience in data ingestion into Dell ECS from external sources and systems
- Create pipelines for data ingestion
- Data refinement prior to ingestion
- Maintenance of the data lake
- Automate the creation and delivery of files from Dell ECS to downstream applications, based on requirements
- Database scripting in PySpark, Python, Hive Query Language (HQL), and Unix shell
Tech stack
- Hadoop/Dell ECS
- SQL
- PySpark
- Python
- Unix
- HQL
Education
Bachelor's Degrees and Advanced Diplomas: Manufacturing, Engineering and Technology




