description
The generic read function loads tabular data using the Frictionless specification.

read

danfo.read(source, configs) [source]

| Parameters | Type | Description | Default |
| --- | --- | --- | --- |
| source | string | A path to the file or resource. It can be a local file, a URL to tabular data (CSV, Excel), or a Datahub.io Data Resource. | |
| configs | object | Configuration options. Supported params are listed below. | |

Supported configs params:

  • data_num (default: 0): The specific dataset to load when reading data from a datapackage.json.
  • header (default: true): Whether the dataset contains a header row.
  • sheet (default: 0): The number of the Excel sheet you want to load.

Returns:

**Promise**. Resolves to a DataFrame.

The read function uses frictionless.js under the hood. frictionless.js is a lightweight, standardized "stream-plus-metadata" interface for accessing files and datasets, especially tabular ones (CSV, Excel). It follows the Frictionless spec.

Note: The read method is only available in danfojs-node at the moment.
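
To make the documented defaults concrete, the sketch below merges a partial configs object with the default values listed above. It is illustrative only: `READ_DEFAULTS` and `resolveReadConfigs` are hypothetical names, not part of danfo.js, and the library's internal handling of configs may differ.

```javascript
// Hypothetical helper, not part of danfo.js: it only mirrors the defaults
// documented for the `configs` parameter of read.
const READ_DEFAULTS = { data_num: 0, header: true, sheet: 0 };

function resolveReadConfigs(configs = {}) {
  // Keys supplied by the caller override the documented defaults.
  return { ...READ_DEFAULTS, ...configs };
}

// Only `sheet` is overridden; `data_num` and `header` keep their defaults.
console.log(resolveReadConfigs({ sheet: 2 }));
```

In practice you pass only the keys you want to change, for example `{ sheet: 2 }`, as the second argument to read.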

Read a CSV file

{% tabs %} {% tab title="Node.js" %}

const dfd = require("danfojs-node")

async function load_data() {
    let df = await dfd.read("file.csv")
    let sample = await df.sample(10)
    sample.print()
}

load_data()

{% endtab %} {% endtabs %}

Loading Files from URL

By specifying a valid URL, you can load a CSV or Excel file:

{% tabs %} {% tab title="Node.js" %}

const dfd = require("danfojs-node")

async function load_data() {
 let df = await dfd.read("https://raw.githubusercontent.com/plotly/datasets/master/finance-charts-apple.csv")
 df.head().print()
}

load_data()

{% endtab %} {% endtabs %}
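
For Excel files, the sheet option selects which worksheet to load. The sketch below is an illustration under stated assumptions: `loadSheet` is a hypothetical wrapper, and `reader` stands for any object exposing `read(source, configs)`, such as the module returned by `require("danfojs-node")`. The workbook URL in the usage comment is a placeholder, not a real dataset.

```javascript
// Hypothetical wrapper: `reader` is any object with read(source, configs),
// for example the module returned by require("danfojs-node").
async function loadSheet(reader, source, sheetNumber = 0) {
  // `sheet` is the documented config key; 0 is its documented default.
  return reader.read(source, { sheet: sheetNumber });
}

// Usage (placeholder URL, kept as a comment so the sketch has no dependencies):
// const dfd = require("danfojs-node")
// loadSheet(dfd, "https://example.com/workbook.xlsx", 1)
//   .then((df) => df.head().print())
```

Passing the reader in as an argument keeps the wrapper easy to exercise with a stub; calling `dfd.read(source, { sheet: 1 })` directly is equivalent.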

Loading Data from a Data Package Descriptor

You can load data from a frictionless data package descriptor. A data package descriptor is a central file in a Data Package. It is a JSON file that provides:

  • General metadata such as the package’s title, license, publisher etc
  • A list of the data “resources” that make up the package including their location on disk or online and other relevant information (including, possibly, schema information about these data resources in a structured form)
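
To make that structure concrete, here is a minimal sketch of a descriptor written as a JavaScript object, together with a helper that picks a resource by position in the way data_num is described. All names, paths, and values are invented for illustration, and `selectResource` is a hypothetical function, not part of danfo.js or frictionless.js. Zero-based indexing is an assumption based on the documented default of 0.

```javascript
// Invented descriptor, shaped like a datapackage.json: general metadata
// plus a list of resources. None of these values come from a real package.
const descriptor = {
  name: "example-package",
  title: "Example Package",
  licenses: [{ name: "CC0-1.0" }],
  resources: [
    { name: "daily", path: "data/daily.csv", format: "csv" },
    { name: "monthly", path: "data/monthly.csv", format: "csv" },
  ],
};

// Hypothetical helper: choose a resource by index (assumed zero-based).
function selectResource(pkg, dataNum = 0) {
  const resource = pkg.resources[dataNum];
  if (resource === undefined) {
    throw new RangeError(`No resource at index ${dataNum}`);
  }
  return resource;
}

console.log(selectResource(descriptor, 1).path);
```

Under these assumptions, an index of 1 selects the second entry in the resources list.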

For instance, the example below loads the resource selected by data_num: 1 from the Natural Gas dataset on datahub.io.

{% tabs %} {% tab title="JavaScript" %}

const dfd = require("danfojs-node")

async function load_data() {
  const package_url =
    "https://datahub.io/core/natural-gas/datapackage.json";

  // data_num selects which resource in the data package to load
  const df = await dfd.read(package_url, { data_num: 1 });
  df.head().print();
}

load_data()

{% endtab %} {% endtabs %}

╔═══╤═══════════════════╤═══════════════════╗
║   │ Date              │ Price             ║
╟───┼───────────────────┼───────────────────╢
║ 0 │ 1997-01-07        │ 3.81999999999...  ║
╟───┼───────────────────┼───────────────────╢
║ 1 │ 1997-01-08        │ 3.79999999999...  ║
╟───┼───────────────────┼───────────────────╢
║ 2 │ 1997-01-09        │ 3.60999999999...  ║
╟───┼───────────────────┼───────────────────╢
║ 3 │ 1997-01-10        │ 3.91999999999...  ║
╟───┼───────────────────┼───────────────────╢
║ 4 │ 1997-01-13        │ 4                 ║
╚═══╧═══════════════════╧═══════════════════╝