Data Engineering - Big Data Trunk
https://project.bigdatatrunk.com
Quality Corporate and Classroom Training in Bay Area CA
Last updated: Mon, 07 Apr 2025 15:42:34 +0000

Byte Sized Series – AI First Mindset
https://project.bigdatatrunk.com/courses/byte-sized-series-ai-first-mindset/
Published: Mon, 07 Apr 2025 08:03:03 +0000

Description:

This introductory session offers professionals a clear, engaging entry point into the world of Generative AI - what it is, how it has evolved, and why it's transforming today’s workplaces. Through real-world use cases and guided discussions, participants will explore GenAI’s strengths and challenges, while uncovering how an ‘AI-First Mindset’ fosters creativity, accelerates tasks, and enhances collaborative decision-making across functions. No technical experience is needed — just curiosity and a willingness to rethink how we approach work.

Duration: 90 min

Course Code: BDT473

Learning Objectives:

After completing this course, participants will be able to:

  • Understand the core concepts of Generative AI and its evolution from traditional AI.
  • Recognize key strengths, risks, and current limitations of GenAI tools.
  • Identify practical opportunities to adopt GenAI tools in their professional roles.

Prerequisites:

  • No technical background required
  • A general curiosity about AI and openness to new ways of working

Audience:

  • Business professionals looking to understand how AI can enhance productivity
  • Team leads and managers exploring AI integration in workflows
  • Non-technical professionals interested in AI’s impact on the workplace
  • Tech-savvy individuals seeking a strategic perspective before diving into tools

Course Outline:

  • Introduction to Generative AI and Its Evolution
  • Key milestones in AI development
  • Real-world examples of GenAI tools (text, image, audio, code generation)
  • Understanding the AI-First Mindset
  • What it means to think “AI-First” (mindset vs. toolset)
  • How companies are reimagining workflows using GenAI
  • Shifting perspectives: AI as a co-pilot, not a replacement
  • GenAI Strengths and Challenges
  • Human + AI: Practical Use Cases in the Workplace
  • Cross-functional examples: Marketing, HR, legal, sales, data analysis
  • Live demo and Q&A

Generative AI for UX Designers
https://project.bigdatatrunk.com/courses/generative-ai-for-ux-designers/
Published: Thu, 03 Apr 2025 07:16:26 +0000

Description:

This 1-day course teaches UX designers how to use Generative AI (Gen AI) to enhance their workflows, speed up design tasks, and improve user experiences. Participants will learn to integrate AI into various stages of the UX design process—from ideation and prototyping to user research and usability testing. The course will focus on practical AI tools that directly impact design quality, user insights, and creativity.

Duration: 1 Day

Course Code: BDT424

Learning Objectives:

By the end of the training, participants will be able to:

  • Understand how Generative AI can improve UX design:
  • Speed up the ideation phase with AI
  • Create efficient prototypes with AI assistance
  • Enhance user research with AI tools.
  • Improve usability testing with AI
  • Personalize user experiences with AI

Prerequisites:

  • Basic understanding of UX design principles.
  • Familiarity with tools like Figma, Sketch, or Adobe XD.
  • Interest in incorporating AI tools into the design process.

Audience:

  • The target audience for this course includes UX designers, UI designers, product designers, design researchers, and UX researchers.

Course Outline:

1. Introduction to Generative AI in UX Design

  • What is Generative AI?
    • Brief overview of AI, machine learning, and generative models, with an emphasis on their relevance for UX design.
  • Why Should UX Designers Embrace AI?
    • How AI enhances creativity, workflow, and decision-making within the UX design process.
    • Key areas where AI can be most effective: ideation, prototyping, research, testing, and user experience personalization.

2. AI for Ideation and Concept Design

  • Leveraging AI for Design Inspiration
    • Using tools like DALL·E, MidJourney, and Runway to generate design ideas, user interface concepts, and creative assets.
  • AI-Driven Design Recommendations
    • How AI can assist in generating color schemes, layouts, typography, and other visual components based on best practices.

3. AI-Powered Prototyping and Wireframing

  • AI-Assisted Prototype Generation
    • Tools like Uizard and Figma AI for auto-generating wireframes and responsive designs.
    • How AI can speed up the prototyping process by suggesting UI elements and layouts based on user needs.
  • AI for Layouts and Component Variations
    • Using AI to quickly iterate on design elements like buttons, navigation menus, and page layouts.

4. AI in User Research and Data Analysis

  • AI-Driven User Research Insights
    • How AI tools (e.g., Lookback, Hotjar) help collect, analyze, and interpret user behavior data.
    • Using AI to identify trends, patterns, and common pain points in user feedback and analytics.
  • AI for Sentiment Analysis
    • Leveraging AI to analyze open-ended responses from user interviews, surveys, and feedback sessions.

5. AI for Usability Testing and Feedback Analysis

  • AI-Enhanced Usability Testing
    • How AI tools can simulate user interactions and provide predictive insights about potential usability issues.
    • Tools like UsabilityHub and PlaybookUX for AI-assisted usability testing and heatmap generation.
  • Using AI to Optimize User Interfaces
    • Analyzing session recordings, heatmaps, and interaction data to identify areas for design improvement.

6. AI for Personalizing User Experiences

  • Creating Dynamic, Personalized User Interfaces
    • How AI can help deliver personalized experiences based on user behavior, preferences, and historical data.
    • Implementing personalized content, adaptive layouts, and dynamic UI elements with AI tools.

Training material provided: Yes (Digital format)

Data Science for Finance
https://project.bigdatatrunk.com/courses/data-science-for-finance/
Published: Tue, 01 Apr 2025 12:39:41 +0000

Description:

This 3-day hands-on training bridges the gap between financial domain expertise and data science techniques. Participants will explore the foundational concepts of data science and their application in finance, including risk modeling, credit scoring, fraud detection, and algorithmic trading. Through a mix of real-world datasets and use cases, learners will use Python and free libraries such as pandas, scikit-learn, and TensorFlow to implement machine learning and deep learning models. Each module includes both conceptual understanding and practical implementation, empowering participants to apply data-driven insights to real-world financial challenges.

Duration: 3 Days

Course Code: BDT48

Learning Objectives:

After this training, participants will be able to:

  1. Describe the role of data science in solving financial problems.
  2. Apply data preprocessing and feature engineering techniques on financial datasets.
  3. Implement machine learning models for classification, regression, and anomaly detection.
  4. Analyze model performance using appropriate metrics and improve predictive accuracy.
  5. Develop and evaluate deep learning models for financial forecasting and risk modeling.

Prerequisites:

  • Basic knowledge of finance and statistics
  • Familiarity with Python programming
  • No prior machine learning experience required

Audience:

  • Finance professionals and analysts exploring data science
  • Aspiring data scientists seeking finance-specific applications
  • Business analysts looking to apply machine learning in financial decision-making

Course Outline:

Module 1: Introduction to Data Science in Finance

  • Overview of data science lifecycle
  • Key challenges in financial data analysis
  • Exploratory data analysis using pandas and matplotlib
  • Financial datasets: market data, transactions, credit data
  • Hands-on: Cleaning and visualizing financial time-series and tabular data

 

Module 2: Machine Learning Applications in Finance

  • Supervised vs unsupervised learning in finance
  • Classification use cases: credit scoring, fraud detection
  • Regression use cases: stock price prediction, financial forecasting
  • Clustering use cases: customer segmentation and profiling
  • Hands-on: Building and evaluating models using scikit-learn
  • Hands-on: ROC curves, confusion matrix, R-squared, and error metrics
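
The evaluation metrics named in Module 2 can be illustrated with a short sketch. It is an illustration under stated assumptions rather than course material: it uses only the Python standard library instead of scikit-learn, and the function names, the sample fraud labels, and the sample forecasts are all hypothetical.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, where 1 is the positive class."""
    pairs = Counter(zip(y_true, y_pred))
    return pairs[(1, 1)], pairs[(0, 1)], pairs[(1, 0)], pairs[(0, 0)]


def precision_recall(y_true, y_pred):
    """Precision = tp / (tp + fp); recall = tp / (tp + fn)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all true values are equal")
    return 1 - ss_res / ss_tot


# Hypothetical fraud-detection labels (1 = fraud, 0 = legitimate).
actual = [1, 0, 0, 1, 0, 1, 0, 0]
predicted = [1, 0, 1, 1, 0, 0, 0, 0]

print(confusion_counts(actual, predicted))   # (2, 1, 1, 4)
print(precision_recall(actual, predicted))

# Hypothetical regression targets and forecasts.
print(r_squared([3.0, 5.0, 7.0], [2.5, 5.0, 7.5]))  # 0.9375
```

In practice, scikit-learn's `sklearn.metrics` module offers `confusion_matrix`, `precision_score`, `recall_score`, and `r2_score` for the same calculations.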

 

Module 3: Deep Learning and Advanced Use Cases

  • Introduction to neural networks for financial data
  • Time-series modeling with LSTM for forecasting
  • Anomaly detection using autoencoders
  • Use case: Loan default prediction using TensorFlow/Keras
  • Hands-on: Deep learning model implementation and evaluation
  • Best practices for model deployment and explainability (LIME, SHAP)

 

Training material provided: Yes (Digital format)

Data Engineering and Analytics on GCP (Google Cloud Platform)
https://project.bigdatatrunk.com/courses/data-engineering-and-analytics-on-gcp-google-cloud-platform/
Published: Tue, 26 Nov 2024 13:26:23 +0000

Description:

This training provides an in-depth introduction to data engineering and analytics on Google Cloud Platform (GCP). Participants will explore key GCP services such as BigQuery, Dataflow, and Cloud Storage while learning to build scalable data pipelines and analyze datasets effectively. The training focuses on hands-on application of GCP tools to address real-world data challenges. By the end of the day, attendees will be equipped to design and implement efficient data workflows and analytics solutions on GCP.

Duration: 1 Day

Course Code: BDT34

Learning Objectives:

By the end of this training, participants will be able to:

  • Identify the core data engineering and analytics tools on GCP.
  • Build data pipelines using Cloud Storage, Dataflow, and Pub/Sub.
  • Analyze large datasets with BigQuery.
  • Design workflows to integrate real-time and batch processing.
  • Optimize data solutions for cost and performance on GCP.

Prerequisites:

  • Basic familiarity with data concepts and cloud computing is recommended. Knowledge of SQL is helpful but not required.

Audience:

  • Data engineers and analysts exploring GCP for data solutions.
  • IT professionals interested in building scalable data workflows on GCP.
  • Business leaders seeking to understand GCP analytics capabilities.

Course Outline:

Module 1: Introduction to GCP for Data Engineering and Analytics

  • Overview of GCP’s Data Ecosystem
  • Key Services: BigQuery, Dataflow, Cloud Storage, and Pub/Sub

Module 2: Data Storage and ETL Pipelines on GCP

  • Storing and Managing Data with Cloud Storage
  • Creating ETL Pipelines with Dataflow
  • Hands-On: Building a Data Pipeline

Module 3: Analytics with BigQuery

  • Introduction to BigQuery: Architecture and Features
  • Querying and Analyzing Datasets
  • Hands-On: Writing and Executing BigQuery SQL Queries

Module 4: Real-Time Data Processing with Pub/Sub

  • Introduction to Pub/Sub for Streaming Data
  • Designing Real-Time Data Workflows
  • Hands-On: Processing Streaming Data

Module 5: Use Cases, Best Practices, and Wrap-Up

  • Real-World Applications of GCP in Data Engineering
  • Best Practices for Performance and Cost Optimization
  • Q&A and Additional Resources

Training material provided: Yes (Digital format)

Data Engineering and Analytics on AWS (Amazon Web Services)
https://project.bigdatatrunk.com/courses/data-engineering-and-analytics-on-aws-amazon-web-services/
Published: Tue, 26 Nov 2024 13:19:40 +0000

Description:

This training provides a hands-on introduction to data engineering and analytics capabilities on AWS. Participants will learn how to build scalable data pipelines, process and analyze data, and use key AWS services such as AWS Glue, Redshift, and Athena. The training emphasizes practical applications of AWS tools to manage and analyze large datasets efficiently. By the end of the session, attendees will have the foundational skills to design and implement data workflows and analytics solutions on AWS.

Duration: 1 Day

Course Code: BDT33

Learning Objectives:

By the end of this training, participants will be able to:

  • Identify the key data engineering and analytics services on AWS.
  • Build data pipelines using AWS Glue and S3.
  • Analyze large datasets using Redshift and Athena.
  • Integrate real-time and batch processing workflows.
  • Evaluate AWS-based solutions for analytics in business scenarios.

Prerequisites:

  • Basic knowledge of cloud computing and data concepts is recommended. Familiarity with SQL is beneficial but not mandatory.

Audience:

  • Data engineers and analysts exploring AWS for data solutions.
  • IT professionals seeking to implement data pipelines and analytics workflows.
  • Business managers interested in AWS-based analytics solutions.

Course Outline:

Module 1: Introduction to AWS Data Engineering and Analytics

  • Overview of Data Engineering and Analytics Concepts
  • AWS Data Ecosystem: S3, Glue, Redshift, Athena, Kinesis

Module 2: Data Storage and ETL Pipelines with AWS Glue

  • Introduction to AWS Glue for Data Integration
  • Building ETL Pipelines and Cataloging Data
  • Hands-On: Creating an ETL Workflow

Module 3: Analytics with Redshift and Athena

  • Overview of Amazon Redshift for Data Warehousing
  • Serverless Analytics with Amazon Athena
  • Hands-On: Querying and Analyzing Data

Module 4: Real-Time Data Processing with Amazon Kinesis

  • Introduction to Streaming Data Processing
  • Designing Real-Time Workflows with Kinesis Data Streams

Module 5: Real-World Use Cases and Best Practices

  • Applications of Data Engineering on AWS
  • Best Practices for Scalability and Cost Optimization
  • Q&A and Additional Resources

Training material provided: Yes (Digital format)

Data Engineering and Analytics on Microsoft Cloud – Azure
https://project.bigdatatrunk.com/courses/data-engineering-and-analytics-on-microsoft-cloud-azure/
Published: Tue, 26 Nov 2024 13:13:59 +0000

Description:

This training focuses on the powerful data engineering and analytics capabilities provided by Microsoft Azure. Participants will learn how to build robust data pipelines, process large datasets, and perform analytics using Azure services. The training includes an introduction to key Azure tools like Data Factory, Synapse Analytics, and Databricks, complemented by hands-on exercises to apply concepts in real-world scenarios. By the end of the day, participants will have the confidence to implement scalable data engineering and analytics workflows on Azure.

Duration: 1 Day

Course Code: BDT32

Learning Objectives:

By the end of this training, participants will be able to:

  • Describe the data engineering and analytics services available on Azure.
  • Build data pipelines using Azure Data Factory.
  • Process and analyze data with Azure Synapse Analytics and Azure Databricks.
  • Design scalable workflows for ETL and data integration.
  • Evaluate use cases for applying Azure solutions in analytics.

Prerequisites:

  • Basic understanding of data concepts, including ETL and analytics, is recommended. Familiarity with cloud platforms is helpful but not required.

Audience:

  • Data engineers and analysts exploring Azure solutions.
  • IT professionals seeking to integrate data workflows on Azure.
  • Business professionals interested in leveraging data analytics on the cloud.

Course Outline:

Module 1: Introduction to Data Engineering and Analytics on Azure

  • Overview of Data Engineering and Analytics Concepts
  • Introduction to Azure’s Data Ecosystem
  • Key Services: Azure Data Factory, Synapse Analytics, Databricks

Module 2: Building Data Pipelines with Azure Data Factory

  • Introduction to Azure Data Factory (ADF)
  • Data Integration and ETL Workflow Design
  • Hands-On: Creating and Managing Pipelines

Module 3: Processing and Analyzing Data with Azure Synapse Analytics

  • Overview of Azure Synapse: Features and Architecture
  • Performing Analytics with SQL and Serverless Pools
  • Hands-On: Analyzing Data in Synapse

Module 4: Advanced Data Processing with Azure Databricks

  • Introduction to Azure Databricks and Apache Spark Integration
  • Processing Large Datasets in Real-Time
  • Hands-On: Implementing Analytics with Databricks

Module 5: Real-World Use Cases and Wrap-Up

  • Real-World Applications of Data Engineering on Azure
  • Best Practices for Performance and Cost Optimization
  • Q&A and Additional Resources

Training material provided: Yes (Digital format)

Kickstart DBT for Snowflake in a Day
https://project.bigdatatrunk.com/courses/kickstart-dbt-for-snowflake-in-a-day/
Published: Thu, 07 Sep 2023 06:34:07 +0000

Description:

In this one-day course, students will dive into the core concepts of Data Build Tool (DBT) and learn how to streamline data engineering pipelines for Snowflake. From understanding models and materialization to exploring source freshness and using advanced techniques like macros and hooks, students will gain a solid foundation in using DBT effectively. Students will also get practical hands-on experience using DBT with Snowflake.

Duration: 1 Day

Course Code: BDT302

Learning Objectives:

After this course, you will be able to:

  • Explain what Data Build Tool (DBT) is
  • Understand DBT models
  • Use DBT tests to ensure the quality of DBT models
  • Explore DBT materializations to optimize the performance and scalability of DBT models
  • Integrate seeds and sources into a DBT project during data ingestion
  • Use DBT hooks to integrate external scripts and actions into your DBT workflows

Prerequisites:

Basic understanding of Snowflake and SQL.

Audience:

This course is designed for Analytics Engineers, Data Analysts, BI Professionals, Data Scientists, Data Engineers, DevOps Engineers, and Architects.

Course Outline:

1. Introduction to Data Build Tool (DBT)

  • Introduction to Data Warehouse (Snowflake)
  • ETL vs. ELT
  • DBT introduction
  • DBT installation
  • DBT Cloud introduction
  • Lab: Getting started with DBT

2. Understanding DBT Models

  • What are DBT models?
  • Creating a DBT table
  • Using DBT schema
  • DBT project organization
  • Lab: Project organization

3. Using DBT Tests

  • What is DBT schema?
  • What is a DBT Macro?
  • Understanding DBT test types
  • Lab: Generic and Singular Tests

4. Exploring DBT Materialization

  • What are materializations in DBT?
  • Default materialization in DBT
  • Using Config Block for materialization
  • Lab: Setting materialization

5. Integrating DBT Seeds and Sources

  • Seeds and Sources overview
  • Adding sources in DBT
  • What is source freshness?
  • Lab: Adding a source freshness check in DBT
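
The idea behind a source freshness check can be sketched in a few lines of Python. This is an illustration only, not DBT's implementation: DBT runs the check inside the warehouse through the `dbt source freshness` command and reads `loaded_at_field`, `warn_after`, and `error_after` from the project's YAML. The function names, the thresholds, and the timestamps below are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Lengths of the period units accepted by this sketch.
PERIODS = {
    "minute": timedelta(minutes=1),
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
}


def threshold(rule):
    """Convert a rule such as {"count": 12, "period": "hour"} to a timedelta."""
    return rule["count"] * PERIODS[rule["period"]]


def freshness_status(max_loaded_at, now, warn_after, error_after):
    """Classify a source as 'pass', 'warn', or 'error' by the age of its newest row."""
    age = now - max_loaded_at
    if age > threshold(error_after):
        return "error"
    if age > threshold(warn_after):
        return "warn"
    return "pass"


# Hypothetical thresholds: warn after 12 hours, error after 24 hours.
now = datetime(2023, 9, 7, 12, 0, tzinfo=timezone.utc)
warn_after = {"count": 12, "period": "hour"}
error_after = {"count": 24, "period": "hour"}

for hours_old in (6, 18, 30):
    newest_row = now - timedelta(hours=hours_old)
    print(hours_old, freshness_status(newest_row, now, warn_after, error_after))
```

With these thresholds, a source whose newest row is 6 hours old passes, one that is 18 hours old produces a warning, and one that is 30 hours old produces an error.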

6. DBT Hooks

  • What are DBT Hooks?
  • Understanding pre-hook, post-hook, on-run-start, on-run-end hooks
  • Implementing a DBT hook

Training material provided: Yes (Digital format)

Hands-on Lab: Instructions will be provided for setting up a free-tier Snowflake account and for installing the DBT tool on Windows/Mac.

Mastering Data Build Tool (DBT) for Snowflake
https://project.bigdatatrunk.com/courses/mastering-data-build-tool-dbt-for-snowflake/
Published: Thu, 07 Sep 2023 05:53:45 +0000

Description:

In today's data-driven landscape, the ability to efficiently manage, transform, and materialize data is crucial. In this comprehensive hands-on course, students will dive deep into the world of DBT and learn how to leverage its power to build robust data transformation pipelines. From foundational concepts to advanced techniques, students will gain hands-on experience working with key components: models, materialization, seeds, snapshots, source freshness, macros, and hooks. By the end of this course, students will be equipped with skills to build efficient data pipelines using modern data build tools.

Duration: 2 days

Course Code: BDT301

Learning Objectives:

After this course, you will be able to:

  • Explain what Data Build Tool (DBT) is
  • Understand DBT models
  • Use DBT tests to ensure the quality of DBT models
  • Explore DBT materializations to optimize the performance and scalability of DBT models
  • Integrate seeds and sources into a DBT project during data ingestion
  • Enhance data loading capabilities and create custom macros
  • Use DBT snapshots to capture historical versions of data for auditing and analysis
  • Use DBT hooks to integrate external scripts and actions into your DBT workflows

Prerequisites:

Basic understanding of Snowflake and SQL.

Audience:

This course is designed for Analytics Engineers, Data Analysts, BI Professionals, Data Scientists, Data Engineers, DevOps Engineers, and Architects.

Course Outline:

1. Introduction to Data Build Tool (DBT)

  • Introduction to Data Warehouse (Snowflake)
  • ETL vs. ELT
  • DBT introduction
  • DBT installation
  • DBT Cloud introduction
  • Lab: Getting started with DBT

2. Understanding DBT Models

  • What are DBT models?
  • Creating a DBT table
  • Using DBT schema
  • DBT project organization
  • Lab: Project organization

3. Using DBT Tests

  • What is DBT schema?
  • What is a DBT Macro?
  • Understanding DBT test types
  • Lab: Generic and Singular Tests

4. Exploring DBT Materialization

  • What are materializations in DBT?
  • Default materialization in DBT
  • Using Config Block for materialization
  • Lab: Setting materialization

5. Integrating DBT Seeds and Sources

  • Seeds and Sources overview
  • Adding sources in DBT
  • What is source freshness?
  • Lab: Adding a source freshness check in DBT

6. DBT Custom Macros

  • Implementing table, view, and ephemeral models
  • Creating a custom macro
  • Understanding DBT packages
  • Lab: Building an incremental load

7. Working with DBT Snapshots

  • Snapshots overview
  • Creating a snapshot
  • Lab: Creating a snapshot
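
A snapshot keeps earlier versions of a row instead of overwriting them. The sketch below illustrates that bookkeeping in plain Python under several assumptions: it compares every column to detect a change (similar in spirit to DBT's `check` strategy), it borrows the `dbt_valid_from` and `dbt_valid_to` column names that DBT adds to snapshot tables, and it ignores rows deleted from the source. The `apply_snapshot` function and the sample order data are hypothetical and are not how DBT implements snapshots internally.

```python
def apply_snapshot(history, current_rows, key, as_of):
    """Record changes between the stored history and the current source rows.

    history:      list of dicts holding source columns plus dbt_valid_from / dbt_valid_to
    current_rows: list of dicts representing the source table right now
    key:          name of the unique key column
    as_of:        timestamp (any comparable value) of this snapshot run
    """
    # Rows whose dbt_valid_to is None are the currently valid versions.
    open_rows = {r[key]: r for r in history if r["dbt_valid_to"] is None}
    for row in current_rows:
        existing = open_rows.get(row[key])
        if existing is not None:
            if all(existing[col] == value for col, value in row.items()):
                continue  # nothing changed, keep the open version as-is
            existing["dbt_valid_to"] = as_of  # close the outdated version
        history.append({**row, "dbt_valid_from": as_of, "dbt_valid_to": None})
    return history


# Hypothetical order table captured on two different days.
history = []
apply_snapshot(history, [{"id": 1, "status": "pending"}], "id", "2023-09-01")
apply_snapshot(history, [{"id": 1, "status": "shipped"}], "id", "2023-09-03")

for version in history:
    print(version)
```

After the second run the history holds two versions of order 1: the `pending` version, closed on 2023-09-03, and the `shipped` version, which is still open.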

8. DBT Hooks

  • What are DBT Hooks?
  • Understanding pre-hook, post-hook, on-run-start, on-run-end hooks
  • Implementing a DBT hook
  • Lab: Implementing a DBT hook

Training material provided: Yes (Digital format)

Hands-on Lab: Instructions will be provided for setting up a free-tier Snowflake account and for installing the DBT tool on Windows/Mac.

The post Mastering Data Build Tool (DBT) for Snowflake first appeared on Big Data Trunk.

Kickstart Snowpark with Python
https://project.bigdatatrunk.com/courses/kickstart-snowpark-with-python/
Thu, 07 Sep 2023 05:42:52 +0000

Snowpark is a new developer experience for Snowflake that lets developers write code in their preferred language (Scala, Java, or Python) to supplement the original SQL interface.

The post Kickstart Snowpark with Python first appeared on Big Data Trunk.

Description:

Unlock the full potential of Snowpark, the innovative developer experience for Snowflake. This course equips you with the expertise to leverage your preferred language—Scala, Java, or Python—alongside the SQL interface. Discover how to harness the Snowpark API to create a customized software development environment. Say goodbye to exporting data to external environments and tap into Snowflake's powerful computing capabilities. Dive into reading and writing operations, transformations, queries, and the creation of Python UDFs (user-defined functions) using Snowpark.

Duration: 1 day

Course Code: BDT300

Learning Objectives

By the end of this course, you will:

  • Get Started with Snowpark and Python Integration in Snowflake
  • Leverage Snowpark for Efficient Structured Data Reading and Writing in Snowflake
  • Master the Art of Handling Semi-Structured Data Using Snowpark
  • Perform Real-Time Data Transformations While Loading with Snowpark
  • Seamlessly Integrate Third-Party Python Libraries to Create User-Defined Functions (UDFs) in Snowpark

Prerequisites: Basic knowledge of Snowflake and Python.

This course is designed for anyone interested in using the Snowpark API with Python. It is geared towards data engineers, architects, QA engineers, BI professionals, and data analysts who want to use Python to handle data processing in Snowflake.

Course Outline:

1. Introduction to Snowpark and Python Integration

  • Overview of Snowpark and its importance in Snowflake data processing
  • Brief introduction to Python’s role in Snowpark development
  • Setting up a Snowpark development environment with Python
  • Hands-on: Execute a basic Python script that uses the Snowpark API

2. Use Snowpark to read and write structured data in Snowflake

  • Create a Snowpark DataFrame
  • Apply a schema to a DataFrame
  • Read from S3: CSV and JSON
  • Write data from S3 to a Snowflake table, CSV, and JSON
  • Hands-on lab on these topics

3. Handling semi-structured data with Snowpark

  • Create a DataFrame from S3 JSON files
  • Copy data into a Snowflake DataFrame
  • Create a DataFrame from Parquet files
  • Copy data from S3 Parquet files into a Snowflake table
  • Handle error records
  • Hands-on labs on these topics

4. Perform transformations while loading

  • Using the Snowpark aggregation framework
  • Perform grouping of data
  • Using Window functions
  • Using Join and the “using” clause
  • Hands-on labs with these topics

5. Integrating third-party Python libraries to create UDFs

  • Build a library of generic, reusable components in Python
  • Create a Snowpark UDF
  • Using vectorized UDFs
  • Integrating external packages
  • Hands-on lab with these topics

Training material provided: Yes (Digital format)

Hands-on Lab: Students will receive instructions for creating a trial Snowflake account. Instructions for installing and using Snowpark will be provided in class.

Getting Started with Apache Spark using Databricks
https://project.bigdatatrunk.com/courses/getting-started-with-apache-spark-using-databricks/
Thu, 07 Sep 2023 05:08:36 +0000

Jumpstart your data journey with our 'Getting Started With Apache Spark Using Databricks' training. This course empowers participants to tackle complex data challenges, harnessing the potential of Apache Hadoop and Apache Spark to uncover valuable insights across various domains.

The post Getting Started with Apache Spark using Databricks first appeared on Big Data Trunk.

Description:

Jumpstart your data journey with our 'Getting Started With Apache Spark Using Databricks' training. This course empowers participants to tackle complex data challenges, harnessing the potential of Apache Hadoop and Apache Spark to uncover valuable insights across various domains.

In today's data-driven world, Big Data has become the driving force behind intelligent enterprise software. Companies worldwide are adopting Big Data solutions to manage vast, high-velocity data streams efficiently.

For software architects and engineers, this course offers a practical, hands-on experience with a blend of lectures, demonstrations, and interactive labs, ensuring a comprehensive understanding of Big Data and Apache Spark's advanced applications. Start your data transformation journey today.

Duration: 4 Days

Course Code: BDT97

Learning Objectives:

After this course, you will be able to:

  • Have a broad understanding of the Big Data ecosystem.
  • Understand industry offerings for Big Data in the cloud and on premises, such as Cloudera, Hortonworks, MapR, Amazon EMR, and Microsoft Azure HDInsight.
  • Understand the impact and value of Apache Spark in the Big Data ecosystem.
  • Understand the Apache Spark architecture and its libraries for use cases such as SQL, Streaming, Machine Learning, and GraphX/GraphFrames.
  • Set up an account on Apache Spark Databricks Cloud.
  • Perform hands-on activities in the Big Data ecosystem.

Prerequisites:

  • Experience with a programming language such as Python is required.
  • SQL and data knowledge
  • Familiarity with Big Data is a plus

This course is designed for Data Analysts, Software Engineers, Data Engineers, Data Professionals, Business Intelligence Developers, Data Architects, and DevOps Engineers.

Course Outline

Day 1:

Big Data overview

  • A brief history of Big Data
  • History and background of Big Data and Hadoop
  • 5 V’s of Big Data
  • Secret Sauce of Big Data Hadoop
  • Big Data Distributions in Industry
  • End-to-End Big Data Life cycle overview
  • Industry Use cases

Big Data Ecosystem before Spark

  • Big Data Ecosystem before Apache Spark
  • Storage options – HDFS and NoSQL
  • Processing options – MapReduce, Hive, etc.
  • Administrative tools – ZooKeeper, Oozie, etc.
  • Ingestion tools – Sqoop, Flume

Big Data Ecosystem after Spark

  • Big Data Ecosystem after Apache Spark
  • Compare MapReduce vs. Apache Spark
  • Apache Spark Architecture
  • Understand the Apache Spark architecture and libraries such as Streaming, Machine Learning with Spark ML, GraphX/GraphFrames, etc.
  • Understanding Spark RDD
  • Set up an account on Apache Spark Databricks Cloud
  • Introduction to the notebook concept on Databricks
  • Demos and Labs

Day 2:

Getting Started with Apache Spark

  • Introduction to Spark RDD
  • Spark RDD Transformations and Actions
  • Spark Lifecycle
  • Spark Caching
  • Lab - Spark RDD Transformations & Actions
  • Lab - Spark RDD Advanced Transformations & Actions
  • Demos and Labs

Apache Spark SQL, DataFrames, Datasets

  • Introduction to Spark SQL
  • SQL, DataFrames and Datasets Spark Library
  • Compare the various APIs - RDD, DataFrames and Datasets
  • Lab - Spark DataFrames Transformations & Actions
  • Lab - Spark DataFrames Advanced Transformations & Actions
  • Demos and Labs

Day 3:

Data Science Overview

  • Data Science Process Overview
  • Structured and Unstructured Data
  • Data Acquisition and Transformation
  • Data Analysis and Machine Learning
  • Machine Learning Concepts

Machine Learning Overview using Apache Spark

  • Introduction to Machine Learning and Data Science
  • Machine Learning Spark Library
  • Spark Machine Learning – Classification, Regression
  • Machine Learning Model building with Spark ML Library
  • Demos and Labs

Day 4:

Structured Streaming Overview using Apache Spark

  • The need for real-time processing
  • Streaming Spark Library
  • Streaming Query
  • Processing and Aggregating Streams
  • Data Lake concept
  • Spark Streaming examples
  • Demos and Labs

GraphX/GraphFrames Overview using Apache Spark

  • The need for GraphX/GraphFrames
  • Spark GraphX & GraphFrames Library
  • Spark GraphX & GraphFrames examples
  • Demos and Labs

Training material provided: Yes (Digital format)
