Not sure why the escaped single quote doesn't appear in the Spark SQL output. I initially tried this in a Jupyter notebook, but reproduced it in the PySpark shell below.
$ pyspark
SPARK_MAJOR_VERSION is set to 2, using Spark2
Python 3.6.3 |Anaconda custom (64-bit)| (default, Oct 13 2017, 12:02:49)
[GCC 7.2.0] on linux
Using Python version 3.6.3 (default, Oct 13 2017 12:02:49)
SparkSession available as 'spark'.
>>> spark.version
'2.3.0.2.6.5.0-292'
>>> spark.sql("select 'This is Bob''s home' as column1").show(truncate=False)
+-----------------+
|column1          |
+-----------------+
|This is Bobs home|
+-----------------+
The output shows Bobs home instead of Bob's home.
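Part of the confusion is that there are two layers of escaping here: Python resolves its own string escapes before Spark ever sees the query text. A quick pure-Python check (no Spark required; the variable names are mine) shows what SQL text each quoting style actually hands to the parser:

```python
# What the SQL parser actually receives for each Python quoting style.
# Python resolves its own escape sequences first, so \' in a normal
# double-quoted Python string loses the backslash before Spark runs.
doubled = "select 'This is Bob''s home' as column1"
plain   = "select 'This is Bob\'s home' as column1"   # backslash consumed by Python
raw     = r"select 'This is Bob\'s home' as column1"  # backslash preserved

print(doubled)  # select 'This is Bob''s home' as column1
print(plain)    # select 'This is Bob's home' as column1  -- no backslash left!
print(raw)      # select 'This is Bob\'s home' as column1
```

Only the raw string delivers a backslash for Spark SQL to interpret as an escape.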
Answer
Spark SQL does not treat a doubled single quote (`''`) as an escape the way ANSI SQL does; it parses `'This is Bob''s home'` as two adjacent string literals, `'This is Bob'` and `'s home'`, and concatenates them. Use a backslash to escape the single quote instead. Note that the backslash itself must survive Python's own string escaping, so use a raw string (or double the backslash):

>>> spark.sql(r"select 'This is Bob\'s home' as column1").show()
+------------------+
|           column1|
+------------------+
|This is Bob's home|
+------------------+
Alternatively, you can surround the string with double quotes, so that the single quote doesn't need to be escaped at all:

>>> spark.sql("""select "This is Bob's home" as column1""").show()
+------------------+
|           column1|
+------------------+
|This is Bob's home|
+------------------+
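If the string comes from user input rather than a literal in your source code, neither trick generalizes and you need to escape whatever value arrives. A sketch of such a helper (`escape_sql_string` is my name, not a Spark API), assuming Spark's default parser settings where backslash escapes are honored inside string literals:

```python
def escape_sql_string(value: str) -> str:
    """Escape a Python string for embedding inside a single-quoted
    Spark SQL literal (default parser settings, backslash escapes on).
    Backslashes must be doubled first, then single quotes escaped."""
    return value.replace("\\", "\\\\").replace("'", "\\'")

# Build the query text; the quote arrives already escaped for Spark SQL.
query = "select '{}' as column1".format(escape_sql_string("This is Bob's home"))
print(query)  # select 'This is Bob\'s home' as column1
```

For anything beyond quick exploration, prefer passing values through the DataFrame API (e.g. `pyspark.sql.functions.lit`) instead of splicing them into SQL text.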