I am trying to use Spark and I am stuck on reading the data.
Here is my code:
df = SQLContext.read.format('com.databricks.spark.csv') \
    .options(header='true', inferSchema='true') \
    .load('C:/....')
The error message says:
'property' object has no attribute 'format'
So I think there is something wrong with format. I tried to read the Spark source code, but it was too hard to follow.
I would really appreciate it if anybody could help me a little.
Answer
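The error occurs because read is accessed on the SQLContext class itself rather than on an instance, so Python hands back the property object instead of a DataFrameReader. In Spark 2.x, create a SparkSession and read through it; the csv format is built in there, so the com.databricks.spark.csv package is not needed. Try the code below.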
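from pyspark.sql import SparkSession

# Create (or reuse) a local SparkSession; read is accessed on this instance.
spark = SparkSession.builder.master("local").appName("sample").getOrCreate()

# Parentheses let the method chain span several lines.
df = (
    spark.read
    .format("csv")
    .options(header='true', inferSchema='true')
    .load(r"D:\tmp\data\sample")  # raw string so "\t" is not read as a tab
)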
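To see why the call in the question fails, here is a minimal sketch in plain Python, with no Spark required. Reader and Context are hypothetical stand-ins for DataFrameReader and SQLContext, not the real classes; the only thing they share with PySpark is that read is declared as a property.

```python
class Reader:
    """Hypothetical stand-in for a DataFrameReader."""

    def format(self, source):
        return f"reading as {source}"


class Context:
    """Hypothetical stand-in for SQLContext: `read` is a property."""

    @property
    def read(self):
        return Reader()


# Accessed on the class, `read` is the property object itself,
# which has no `format` attribute.
class_error = None
try:
    Context.read.format("csv")
except AttributeError as exc:
    class_error = str(exc)

# Accessed on an instance, the getter runs and returns a Reader.
instance_result = Context().read.format("csv")

print(class_error)
print(instance_result)
```

The same reasoning suggests that calling read on an instance, whether a SparkSession or an instantiated SQLContext, avoids the error.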
My Spark version is 2.4.3:
λ pyspark
Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 21:26:53) [MSC v.1916 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.4.3
      /_/
Using Python version 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019 21:26:53)
SparkSession available as 'spark'.
>>> from pyspark.sql import SparkSession
>>> spark = SparkSession.builder.master("local").appName("sample").getOrCreate()
>>> df = spark.read.format("csv").options(header = 'true', inferSchema = 'true').load("D:\tmp\data\sample\country_codes.csv")
>>> df.show()
+-------------------+-----+-----+
|       country_name|iso_2|iso_3|
+-------------------+-----+-----+
|        Afghanistan|   AF|  AFG|
|      Åland Islands|   AX|  ALA|
|            Albania|   AL|  ALB|
|            Algeria|   DZ|  DZA|
|     American Samoa|   AS|  ASM|
|            Andorra|   AD|  AND|
|             Angola|   AO|  AGO|
|           Anguilla|   AI|  AIA|
|         Antarctica|   AQ|  ATA|
|Antigua and Barbuda|   AG|  ATG|
|          Argentina|   AR|  ARG|
|            Armenia|   AM|  ARM|
|              Aruba|   AW|  ABW|
|          Australia|   AU|  AUS|
|            Austria|   AT|  AUT|
|         Azerbaijan|   AZ|  AZE|
|            Bahamas|   BS|  BHS|
|            Bahrain|   BH|  BHR|
|         Bangladesh|   BD|  BGD|
|           Barbados|   BB|  BRB|
+-------------------+-----+-----+
only showing top 20 rows
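A note on Windows paths like the one passed to load in the session above: in an ordinary Python string literal, "\t" is the escape for a tab character, so a path that begins with D:\tmp can silently change. The sketch below is plain Python with no Spark required, and the short D:\tmp path is only an illustration. It compares three ways of writing such a path.

```python
from pathlib import PureWindowsPath

plain = "D:\tmp"     # "\t" becomes a single tab character
raw = r"D:\tmp"      # raw string: the backslash is kept as written
forward = "D:/tmp"   # forward slashes avoid escapes altogether

print(repr(plain))   # shows the embedded tab
print(repr(raw))
print(PureWindowsPath(raw) == PureWindowsPath(forward))
```

A raw string or forward slashes is usually the safer choice when a Windows path is typed directly into code.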