Job description
- Strong Scala and PySpark programming skills; Python experience also required
- Expert-level SQL background
- 8+ years of experience in data analysis, data modelling, and implementation of enterprise-class systems spanning Big Data, data integration, object-oriented programming, and advanced analytics
- Excellent understanding of Hadoop architecture and the daemons of a Hadoop cluster, including JobTracker, TaskTracker, NameNode, and DataNode
- Good understanding of data mining concepts and techniques
- Experience importing and exporting data between RDBMSs and HDFS, Hive tables, and HBase using Sqoop
- Experience importing streaming data into HDFS using Flume sources and sinks, and transforming the data with Flume interceptors
- Exposure to Apache Kafka for building data pipelines that handle logs as a stream of messages using producers and consumers
- Knowledge of HBase and JSON