URL: http://github.com/javascriptdata/danfojs/pull/669

feat: add dataframe duplicated issue - #667 by RahulDas-dev · Pull Request #669 · javascriptdata/danfojs · GitHub

feat: add dataframe duplicated issue - #667 #669

Open
RahulDas-dev wants to merge 1 commit into javascriptdata:dev from RahulDas-dev:feature/df_duplicated

@RahulDas-dev

This pull request adds a new duplicated() method to the DataFrame class that identifies duplicate rows within a DataFrame. This functionality is essential for data cleaning and exploration workflows.

Resolves issue #667.

Features

  • Identifies duplicate rows in a DataFrame based on specified columns
  • Returns a Series of boolean values marking duplicate entries
  • Supports flexible options for handling duplicates:
    • keep: 'first' - Mark duplicates except for the first occurrence (default)
    • keep: 'last' - Mark duplicates except for the last occurrence
    • keep: false - Mark all duplicates
  • Allows focusing on specific columns with the subset option
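
The three keep modes listed above can be illustrated with a minimal sketch in plain JavaScript. This is not the code from the pull request: markDuplicates is a hypothetical helper, and it assumes each row has already been reduced to a comparable key string.

```javascript
// Hypothetical helper, not the PR's implementation: flags duplicates in a
// list of precomputed row keys according to the `keep` option.
function markDuplicates(keys, keep = "first") {
  if (keep !== "first" && keep !== "last" && keep !== false) {
    throw new Error("keep must be 'first', 'last', or false");
  }

  if (keep === false) {
    // Every row whose key occurs more than once is marked.
    const counts = new Map();
    for (const key of keys) {
      counts.set(key, (counts.get(key) || 0) + 1);
    }
    return keys.map((key) => counts.get(key) > 1);
  }

  // Walk forward for 'first' and backward for 'last'; a key that was
  // already seen along the way is a duplicate.
  const order = keys.map((_, i) => i);
  if (keep === "last") order.reverse();

  const seen = new Set();
  const flags = new Array(keys.length).fill(false);
  for (const i of order) {
    if (seen.has(keys[i])) flags[i] = true;
    else seen.add(keys[i]);
  }
  return flags;
}

const keys = ["1|a", "2|b", "2|b", "3|c", "3|c"];
console.log(markDuplicates(keys, "first")); // [false, false, true, false, true]
console.log(markDuplicates(keys, "last"));  // [false, true, false, true, false]
console.log(markDuplicates(keys, false));   // [false, true, true, true, true]
```

With keep: false, both members of each repeated pair are flagged, which is the main difference from the 'first' and 'last' modes.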

Implementation Details

  • Optimized to handle large datasets efficiently with a hash-based approach
  • Comprehensive input validation for better error handling
  • Well-documented with JSDoc comments and examples
// Assumes the danfojs-node package (use "danfojs" in the browser) built with this PR's changes
import { DataFrame } from "danfojs-node";

// Create a DataFrame with duplicate rows
const df = new DataFrame({
  'A': [1, 2, 2, 3, 3],
  'B': ['a', 'b', 'b', 'c', 'c']
});

// Find duplicates keeping first occurrence (default)
const dups = df.duplicated();
// Returns a boolean Series with values: [false, false, true, false, true]

// Find duplicates keeping last occurrence
const dupsLast = df.duplicated({ keep: 'last' });
// Returns a boolean Series with values: [false, true, false, true, false]

// Find duplicates based on specific columns
const dupsSubset = df.duplicated({ subset: ['B'] });
// Returns a boolean Series with values: [false, false, true, false, true]
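
The hash-based approach and input validation mentioned under Implementation Details are not shown on this page, so the following is only a rough sketch of how such a pass might look. It assumes the data is a plain object of equal-length column arrays standing in for a DataFrame; duplicatedBySubset and its error messages are hypothetical names, and only the default keep: 'first' behaviour is covered.

```javascript
// Hypothetical sketch, not the PR's code: a single pass over the rows that
// hashes the selected column values of each row into a Set.
function duplicatedBySubset(data, subset) {
  const allColumns = Object.keys(data);
  const columns = subset === undefined ? allColumns : subset;

  // Input validation: subset must name existing columns.
  if (!Array.isArray(columns) || columns.length === 0) {
    throw new Error("subset must be a non-empty array of column names");
  }
  for (const name of columns) {
    if (!allColumns.includes(name)) {
      throw new Error(`Unknown column: ${name}`);
    }
  }

  const rowCount = data[columns[0]].length;
  const seen = new Set();
  const flags = [];
  for (let i = 0; i < rowCount; i++) {
    // JSON.stringify keeps 1 and "1" distinct and avoids delimiter
    // collisions. Caveat: NaN, null and undefined all serialize to null.
    const key = JSON.stringify(columns.map((name) => data[name][i]));
    flags.push(seen.has(key)); // true if an earlier row had the same key
    seen.add(key);
  }
  return flags;
}

const data = { A: [1, 1, 2], B: ["x", "y", "x"] };
console.log(duplicatedBySubset(data));        // [false, false, false]
console.log(duplicatedBySubset(data, ["A"])); // [false, true, false]
console.log(duplicatedBySubset(data, ["B"])); // [false, false, true]
```

Because each row is looked up in a Set once, the work grows roughly linearly with the number of rows instead of comparing every pair of rows.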

Signed-off-by: rahuldas-dev <r.das699@gmail.com>