Hadoop Developer

Charlotte, NC

Company Name: IBA Infotech LLC

Type: Contract

Primary Skills: Java, Python, Scala

Location: Charlotte

CTC: DOE

Job Description:

As a Hadoop Developer, the ideal candidate should be able to:

  • Demonstrate real hands-on Scala/Spark experience and answer experiential/scenario-based questions.
  • Build high-performing data models on big-data architecture as data services.
  • Build a high-performing and scalable data pipeline platform using Hadoop and Apache Spark.
  • Partner with enterprise data teams such as Data Management & Insights and the Enterprise Data Environment (Data Lake) to identify the best place to source the data.
  • Work with business analysts, development teams, and project managers on requirements and business rules.
  • Collaborate with source system and approved provisioning point (APP) teams, architects, data analysts, and modelers to build scalable and performant data solutions.
  • Work effectively in a hybrid environment where legacy ETL and data warehouse applications coexist with new big-data applications.
  • Work with infrastructure engineers and system administrators, as appropriate, in designing the big-data infrastructure.
  • Work with DBAs in the Enterprise Database Management group to troubleshoot problems and optimize performance.
  • Support ongoing data management efforts for the Development, QA, and Production environments.
  • Utilize a thorough understanding of available technology, tools, and existing designs.
  • Leverage knowledge of industry trends to build best-in-class technology that provides a competitive advantage.
  • Provide QA support and perform testing as needed.
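The pipeline responsibilities above follow the classic extract-transform-load shape that candidates are expected to know. A minimal illustrative sketch in plain Python (standard library only; `sqlite3` stands in for a data-lake sink, and all table and column names here are hypothetical, not from the role itself):

```python
import csv
import io
import sqlite3

def extract(csv_text):
    """Parse raw CSV text into dict rows (stand-in for a Flume/Sqoop source)."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def transform(rows):
    """Apply a simple business rule: keep active accounts, normalize amounts."""
    return [
        (row["account_id"], round(float(row["amount"]), 2))
        for row in rows
        if row["status"] == "active"
    ]

def load(records, conn):
    """Load transformed records into a table (stand-in for a data-lake sink)."""
    conn.execute("CREATE TABLE IF NOT EXISTS balances (account_id TEXT, amount REAL)")
    conn.executemany("INSERT INTO balances VALUES (?, ?)", records)
    conn.commit()

raw = "account_id,status,amount\nA1,active,10.50\nA2,closed,3.00\nA3,active,7.25\n"
conn = sqlite3.connect(":memory:")
load(transform(extract(raw)), conn)
total = conn.execute("SELECT SUM(amount) FROM balances").fetchone()[0]
```

In a production Hadoop setting the same three stages would map onto ingestion tools (Flume/Sqoop), Spark transformations, and a data-lake or warehouse sink.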

 

Required Qualifications

  • ETL (Extract, Transform, Load) Programming experience
  • Experience with Hadoop ecosystem tools for real-time and batch data ingestion, processing, and provisioning, such as Apache Flume, Apache Sqoop, and Apache Spark
  • Java/Scala/Python experience
  • Agile experience
  • Good SQL knowledge
  • Design and development experience with columnar databases using Parquet or ORC file formats on Hadoop
  • Apache Spark design and development experience using Scala, Java, or Python with DataFrames and Resilient Distributed Datasets (RDDs)
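The SQL and columnar-format qualifications above imply comfort with analytical aggregation queries of the kind run over Parquet/ORC-backed tables. A minimal sketch using Python's built-in `sqlite3` as a stand-in for a Hadoop SQL engine such as Hive or Spark SQL (the schema and data are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (region TEXT, event_date TEXT, clicks INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [("east", "2024-01-01", 10),
     ("east", "2024-01-02", 15),
     ("west", "2024-01-01", 7)],
)
# Aggregate clicks per region -- the same GROUP BY shape used over
# Parquet/ORC-backed tables in Hive or Spark SQL.
rows = conn.execute(
    "SELECT region, SUM(clicks) FROM events GROUP BY region ORDER BY region"
).fetchall()
# rows == [("east", 25), ("west", 7)]
```

On Hadoop, the same query would typically run against a partitioned columnar table, where the format's column pruning makes such aggregations efficient.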

 

Desired Qualifications

  • Excellent verbal, written, and interpersonal communication skills
  • Ability to work effectively in a virtual environment where key team members and partners are in various time zones and locations
  • Knowledge and understanding of project management methodologies used in Waterfall or Agile development projects
  • Knowledge and understanding of DevOps principles
  • Reporting experience, analytics experience or a combination of both
  • A BS/BA degree or higher in information technology