# 🔹 SECTION A – Dataset Understanding

## ✅ Step 1: Load the Dataset

1. Open **Power BI Desktop**
2. Click **Home → Get Data → Text/CSV**
3. Select `netflix_titles.csv`
4. Click **Load**

---

## 1️⃣ How many rows and columns?

### Steps:

1. Go to **Data View**
2. The bottom-left corner shows the total number of rows.
3. Go to **Transform Data** (opens the Power Query Editor)
4. Its status bar shows the number of columns.

To inspect each column as well:

👉 Transform Data → View → Column distribution (profiles every column with its distinct and unique value counts)

---

## 2️⃣ Which columns are categorical?

Categorical columns:

* type
* title
* director
* cast
* country
* rating
* listed_in

These are text-based columns.

---

## 3️⃣ Which column contains date information?

👉 `date_added`

### Steps to check:

1. Select the column
2. Go to **Column Tools**
3. Check that Data Type = Date

---

## 4️⃣ Difference between release_year and date_added

* `release_year` → Year the content was originally released
* `date_added` → Date Netflix added it to the platform

---

## 5️⃣ Why blank values?

Reasons:

* Missing information
* Not applicable (e.g., some TV shows may not have directors)
* Data entry issues

---

# 🔹 SECTION B – Data Cleaning & Preparation

## ✅ Step 2: Open Power Query

Home → Transform Data

---

## 6️⃣ Four Data Cleaning Steps

✔ Remove duplicates
✔ Change data types
✔ Handle missing values
✔ Split multi-value columns (like listed_in)

---

## 7️⃣ Handle Missing Director Values

### Option 1 (Recommended in exam): Replace blanks with "Not Available"

Steps:

1. Select `director`
2. Click **Replace Values**
3. Replace `null` with `"Not Available"`

OR

Filter out blanks if required.

---

## 8️⃣ Why Change Data Types?

* Ensures correct calculations
* Enables proper visualizations
* Improves performance
* Prevents errors in DAX

Example: `date_added` must be Date type.

---

## 9️⃣ Problem with Duration Column

It contains:

* “90 min”
* “2 Seasons”

This mixes numbers and text in one column, so it cannot be summed or averaged.

### Solution:

Split into:

* Duration_Number
* Duration_Type

Steps:

1. Select duration
2.
   Split column by delimiter (space)
3. Rename columns

---

## 🔟 Why Split listed_in?

It contains multiple genres in one cell:

Example: “Drama, Romantic, International”

Splitting helps:

* Better filtering
* Accurate genre analysis

Steps:

1. Select listed_in
2. Split by delimiter (,)
3. Use “Split into rows”

---

# 🔹 SECTION C – Basic Calculations (DAX)

Go to **Report View**

---

## 1️⃣1️⃣ Count Total Titles

Go to Modeling → New Measure

```DAX
Total Titles = COUNTROWS('netflix_titles')
```

Use in Card Visual.

Note: if listed_in was split into rows in this same table, each title repeats once per genre, so `COUNTROWS` overcounts. In that case count distinct `show_id` values instead.

---

## 1️⃣2️⃣ Number of Movies Only

```DAX
Total Movies =
CALCULATE(
    COUNTROWS('netflix_titles'),
    'netflix_titles'[type] = "Movie"
)
```

---

## 1️⃣3️⃣ How Do Slicers Help?

Steps to add slicer:

1. Insert → Slicer
2. Drag `release_year` or `country`

They:

* Filter all visuals
* Make the dashboard interactive

---

## 1️⃣4️⃣ Column vs Measure

| Column          | Measure                |
| --------------- | ---------------------- |
| Stored in table | Calculated dynamically |
| Static          | Changes with filters   |
| Row-level       | Aggregated             |

---

## 1️⃣5️⃣ Why KPI Cards?

They:

* Show important numbers
* Are easy to understand
* Highlight performance

Example:

* Total Titles
* Total Movies
* Total TV Shows

---

# 🔹 SECTION D – Visualization & Interpretation

---

## 1️⃣6️⃣ Best Visual for Trend Over Years

📊 Line Chart

Steps:

1. Insert Line Chart
2. Axis → release_year
3. Values → Total Titles

Reason: Shows growth clearly over time.

---

## 1️⃣7️⃣ Compare Movies vs TV Shows

📊 Clustered Column Chart or Pie Chart

Steps:

1. Axis → type
2. Values → Count of title

---

## 1️⃣8️⃣ Three Dashboard Design Practices

Any three of:

✔ Use consistent colors
✔ Avoid clutter
✔ Align visuals properly
✔ Use meaningful titles

---

## 1️⃣9️⃣ If Most Titles are TV-MA?

It indicates:

* Platform focuses on mature/adult audience
* More adult content
* Not primarily child-focused

---

## 2️⃣0️⃣ Three Business Insights from Netflix Dataset

1. Netflix content increased rapidly after 2015.
2. Movies outnumber TV Shows.
3.
   Most content is rated TV-MA → adult-focused audience.

---

# 🎯 Final Dashboard Structure (For Exam)

Your dashboard should include:

✔ KPI Cards:

* Total Titles
* Total Movies
* Total TV Shows

✔ Line Chart:

* Titles by Release Year

✔ Bar Chart:

* Movies vs TV Shows

✔ Bar Chart:

* Top 10 Countries

✔ Slicer:

* Release Year
* Type
* Country

---

Vaishnavi, below are the answers written in **proper Power BI exam format** (the way you should write them in your RGCMS paper). You can copy this style directly in the exam. 👇

---

# 📊 NETFLIX DATASET – POWER BI ANSWERS FORMAT

---

# 🔹 SECTION A – Dataset Understanding

### 1. How many rows and columns are in the dataset?

After loading the dataset in Power BI:

* Go to **Data View**
* Total rows: 8807
* Total columns: 12

---

### 2. Which columns are categorical?

Categorical columns are text-based fields:

* type
* title
* director
* cast
* country
* rating
* listed_in
* description

---

### 3. Which column contains date information?

The column containing date information is:

👉 `date_added`

(Data type changed to Date in Column Tools.)

---

### 4. Difference between release_year and date_added

* `release_year` → Year when the movie/TV show was originally released.
* `date_added` → Date when the content was added to the Netflix platform.

---

### 5. Why might some columns contain blank values?

* Missing information
* Data entry errors
* Information not available
* Not applicable (e.g., director missing)

---

# 🔹 SECTION B – Data Cleaning & Preparation

---

### 6. Four data cleaning steps performed:

1. Removed duplicate rows
2. Changed data types correctly
3. Replaced null values
4. Split multi-value columns (listed_in, cast)

---

### 7. Handling missing values in director column

In Power Query:

* Select director column
* Replace null values with “Not Available”

OR

* Filter out blank rows if required.

---

### 8. Why should we change data types correctly?
* Ensures accurate calculations
* Prevents errors in visuals
* Improves performance
* Enables correct DAX formulas

Example: date_added must be Date type.

---

### 9. Why is mixed duration column a problem?

The duration column contains:

* “90 min”
* “2 Seasons”

This mixes text and numeric values.

Problem:

* Cannot perform numerical analysis.
* Causes incorrect aggregations.

Solution:

* Split into Duration_Number and Duration_Type.

---

### 10. Purpose of splitting listed_in

The listed_in column contains multiple genres separated by commas.

Splitting helps:

* Better filtering
* Genre-wise analysis
* Accurate visualizations

---

# 🔹 SECTION C – Basic Calculations (DAX)

---

### 11. DAX formula to count total titles

```DAX
Total Titles = COUNTROWS('netflix_titles')
```

---

### 12. Calculate number of movies only

```DAX
Total Movies =
CALCULATE(
    COUNTROWS('netflix_titles'),
    'netflix_titles'[type] = "Movie"
)
```

---

### 13. How do slicers help in dashboards?

Slicers:

* Filter data interactively
* Allow dynamic report analysis
* Improve user experience

Example: Slicer on release_year filters all visuals.

---

### 14. Difference between Column and Measure

| Column                | Measure                |
| --------------------- | ---------------------- |
| Calculated row by row | Calculated dynamically |
| Stored in table       | Not stored             |
| Static                | Changes with filters   |

---

### 15. Why are KPI cards useful?

* Highlight key metrics
* Easy to understand
* Improve dashboard readability

Example:

* Total Titles
* Total Movies
* Total TV Shows

---

# 🔹 SECTION D – Visualization & Interpretation

---

### 16. Best visual to show trend over years

Line Chart

Reason: Shows increase/decrease over time clearly.

* Axis → release_year
* Values → Total Titles

---

### 17. Best visual to compare Movies vs TV Shows

Clustered Column Chart

* Axis → type
* Values → Count of title

---

### 18. Three good practices for clean dashboard design

1. Use consistent color theme
2. Avoid overcrowding visuals
3.
   Add proper titles and labels

---

### 19. If most titles are TV-MA, what does it indicate?

* Platform targets adult audience
* More mature content
* Less child-focused content

---

### 20. Three business insights

1. Netflix content increased rapidly after 2015.
2. Movies outnumber TV Shows.
3. The majority of content is rated TV-MA (adult audience focus).

---

# 🎯 Recommended Dashboard Layout (Exam Ready)

Top Row:

* Card → Total Titles
* Card → Total Movies
* Card → Total TV Shows

Middle:

* Line Chart → Titles by Year
* Column Chart → Movies vs TV Shows

Bottom:

* Bar Chart → Top Countries
* Slicer → Release Year
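
The dashboard layouts in this guide include a Total TV Shows card, but Section C only defines the Total Titles and Total Movies measures. Here is a minimal sketch of the missing measure, following the same pattern as Total Movies. It assumes the `type` column stores the label `"TV Show"` exactly (worth confirming in Data View); the measure name is only a suggestion.

```DAX
Total TV Shows =
// Assumption: 'netflix_titles'[type] holds the exact text "TV Show"
CALCULATE(
    COUNTROWS('netflix_titles'),
    'netflix_titles'[type] = "TV Show"
)
```

Placed in a Card visual, this should equal Total Titles minus Total Movies, provided `type` contains only those two values.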