How will you define a schema explicitly?
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, DateType, StringType, IntegerType

from lib.logger import Log4j

if __name__ == "__main__":
    spark = SparkSession \
        .builder \
        .master("local[3]") \
        .appName("SparkSchemaDemo") \
        .getOrCreate()

    logger = Log4j(spark)

    inputSchemaStruct = StructType([
        StructField("col1", DateType()),
        StructField("col2", StringType()),
        StructField("col3", IntegerType()),
        StructField("col4", IntegerType())
    ])

    inputSchemaDDL = """col1 DATE, col2 STRING, col3 INT, col4 INT"""

    inputTimeCsvDF = spark.read \
        .format("csv") \
        .option("header", "true") \
        .schema(inputSchemaStruct) \
        .option("mode", "FAILFAST") \
        .option("dateFormat", "M/d/y") \
        .load("data/input*.csv")

    logger.info("CSV Schema:" + inputTimeCsvDF.schema.simpleString())
What is the Spark DataFrameWriter API?

DataFrameWriter is the interface that describes how data (the result of executing a structured query) should be saved to an external data source. It defaults to the parquet data source format.
You can change the default format through the spark.sql.sources.default configuration property, or choose a format for a single write with format().

Its general structure is:
DataFrame.write
.format(...)
.mode(...)
.option(...)
.save()

Take an example:
df.write
.format("parquet")
.mode(saveMode)
.option("path", "/data/cloud")
.save()

Author: CloudVikas