Fixes #21 #68

Open
shaikmanu797 wants to merge 1 commit into egen:master from shaikmanu797:master

Conversation

@shaikmanu797 commented Jul 24, 2019

This PR reads the coalesce numPartitions value from a custom SparkSession config key, spark.sftp.coalesce.partitions. When the key is not set, numPartitions defaults to 1.

package com.springml.spark.sftp

import org.apache.spark.sql.SparkSession

object Driver extends App {
  val spark: SparkSession = SparkSession.builder().master("local").getOrCreate()
  // Ask the writer to coalesce into 4 partitions (per this PR, the default is 1 when the key is unset).
  spark.conf.set(constants.coalescePartitionsConfKey, 4)

  // Read a CSV file over SFTP, then repartition to 8 so the write-side coalesce has something to reduce.
  val df = spark.read.
    format("com.springml.spark.sftp").
    option("host", "localhost").
    option("port", "2222").
    option("username", "foo").
    option("password", "pass").
    option("fileType", "csv").
    option("inferSchema", "true").
    load("/upload/airports.csv").
    repartition(8)

  // Write the DataFrame back over SFTP as semicolon-delimited CSV.
  df.write.
    format("com.springml.spark.sftp").
    option("host", "localhost").
    option("port", "2222").
    option("username", "foo").
    option("password", "pass").
    option("fileType", "csv").
    option("delimiter", ";").
    save("/upload/")

  spark.close()
}
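
For readers who want to see the lookup-with-default behaviour in isolation, here is a minimal sketch in plain Scala with no Spark dependency. It is not the code from this PR. The `Map` stands in for the session's runtime config, and the object name, the helper names, and the positivity check are all assumptions made for illustration. Only the key name and the default of 1 come from the description above. With a real session, the equivalent lookup would presumably be `spark.conf.get(key, "1").toInt`.

```scala
object CoalesceConfigSketch {
  // Key name and default value as stated in the PR description.
  val CoalescePartitionsConfKey = "spark.sftp.coalesce.partitions"
  val DefaultCoalescePartitions = 1

  // Hypothetical helper: `conf` stands in for the SparkSession runtime config.
  // Returns the configured partition count, or the default when the key is unset.
  def resolveCoalescePartitions(conf: Map[String, String]): Int = {
    val n = conf.get(CoalescePartitionsConfKey)
      .map(_.trim.toInt)
      .getOrElse(DefaultCoalescePartitions)
    // Assumption of this sketch: reject non-positive values up front.
    require(n > 0, s"$CoalescePartitionsConfKey must be positive, got $n")
    n
  }

  // coalesce only reduces the partition count, so the effective number is
  // the smaller of the current and the requested counts.
  def effectivePartitions(current: Int, requested: Int): Int =
    math.min(current, requested)

  def main(args: Array[String]): Unit = {
    val configured = Map(CoalescePartitionsConfKey -> "4")
    // A DataFrame repartitioned to 8, as in the Driver example, with the key set to 4.
    println(effectivePartitions(8, resolveCoalescePartitions(configured))) // prints 4
    // The same DataFrame with the key unset falls back to a single partition.
    println(effectivePartitions(8, resolveCoalescePartitions(Map.empty))) // prints 1
  }
}
```

In the Driver example above, the DataFrame has 8 partitions and the key is set to 4, so under this model the write would produce 4 output partitions rather than the default of 1.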

Test run log attached: test.log

@shaikmanu797
Author

@samuel-pt, please review the PR and let me know if you think any additional changes should be made.

…parkSession config key spark.sftp.coalesce.partitions, default numPartitions set to 1
@marcraminv

Hey @shaikmanu797 @samuel-pt, is there any forecast for when this feature will be merged to master? Thank you 👍!

@shaikmanu797
Author

@marcraminv @fernandomora @vejeta @sunayansaikia @ezra-at-lumedic

Since this PR has not been reviewed for more than a year and the contributors to this repository seem to have been inactive for a long time, I had to go with my own implementation of the API to fix the issue. Feel free to take a look at the package; any feedback is highly appreciated.

https://github.com/arcizon/spark-filetransfer

2 participants