How will you define Schema explicitly and what is Spark DataFrameWriter API?

How will you define Schema explicitly?
	
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, DateType, StringType, IntegerType

from lib.logger import Log4j

if __name__ == "__main__":
    spark = SparkSession \
        .builder \
        .master("local[3]") \
        .appName("SparkSchemaDemo") \
        .getOrCreate()

    logger = Log4j(spark)

    inputSchemaStruct = StructType([
        StructField("col1", DateType()),
        StructField("col2", StringType()),
        StructField("col3", IntegerType()),
        StructField("col4", IntegerType())
    ])

    # Equivalent schema as a DDL string; either form can be passed to .schema().
    inputSchemaDDL = """col1 DATE, col2 STRING, col3 INT, col4 INT"""

    inputTimeCsvDF = spark.read \
        .format("csv") \
        .option("header", "true") \
        .schema(inputSchemaStruct) \
        .option("mode", "FAILFAST") \
        .option("dateFormat", "M/d/y") \
        .load("data/input*.csv")

    inputTimeCsvDF.show(5)
    logger.info("CSV Schema:" + inputTimeCsvDF.schema.simpleString())
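
The listing defines inputSchemaDDL but reads with inputSchemaStruct. DataFrameReader.schema() also accepts a DDL-formatted string, so the same read can be sketched as below. The helper name read_csv_with_ddl and its parameters are assumptions made for illustration; the options mirror the listing above.

```python
def read_csv_with_ddl(spark, path, schema_ddl):
    """Read CSV files using a DDL string instead of a StructType.

    `spark` is assumed to be an existing SparkSession; the helper
    name itself is hypothetical.
    """
    return (
        spark.read
        .format("csv")
        .options(header="true", mode="FAILFAST", dateFormat="M/d/y")
        .schema(schema_ddl)  # schema() accepts a StructType or a DDL string
        .load(path)
    )


# Same columns and types as inputSchemaStruct in the listing above.
INPUT_SCHEMA_DDL = "col1 DATE, col2 STRING, col3 INT, col4 INT"

# Usage, assuming a running SparkSession named `spark`:
# inputTimeCsvDF = read_csv_with_ddl(spark, "data/input*.csv", INPUT_SCHEMA_DDL)
```

The DDL form is shorter to type, while the StructType form is easier to build programmatically.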
	
What is Spark DataFrameWriter API?

DataFrameWriter is the interface that describes how data (the result of executing a structured query) should be saved to an external data source. It defaults to the parquet data source format.
You can change the default format with the spark.sql.sources.default configuration property.
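
A minimal sketch of the default in action: when save() is called without format(), Spark uses the source named by the spark.sql.sources.default property, which is parquet unless changed. The helper names below are hypothetical, and spark and df are assumed to be an existing SparkSession and DataFrame.

```python
DEFAULT_SOURCE_KEY = "spark.sql.sources.default"


def set_default_source(spark, source):
    """Change the data source used when no format() is given."""
    spark.conf.set(DEFAULT_SOURCE_KEY, source)


def save_with_default_source(df, path):
    """Write df without naming a format, so the session default applies."""
    df.write.mode("overwrite").save(path)


# Usage, assuming `spark` and `df` already exist:
# set_default_source(spark, "json")
# save_with_default_source(df, "data/output")  # now written as JSON
```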

Its general structure is:
DataFrameWriter
    .format(…)
    .option(…)
    .partitionBy(…)
    .bucketBy(…)
    .sortBy(…)
    .save()
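
One caveat about this chain, shown as a hedged sketch: in the Spark versions I know of, bucketBy() and sortBy() are supported only when writing to a table with saveAsTable(), and a path-based save() rejects them. The function name, table name, and column names below are hypothetical.

```python
def write_bucketed_table(df, table_name, partition_col, bucket_col, num_buckets=4):
    """Write df as a partitioned, bucketed, sorted Parquet table.

    Ends in saveAsTable() rather than save(), because bucketing
    information is stored in the table metadata.
    """
    (
        df.write
        .format("parquet")
        .mode("overwrite")
        .partitionBy(partition_col)         # one directory per distinct value
        .bucketBy(num_buckets, bucket_col)  # hash rows into a fixed number of buckets
        .sortBy(bucket_col)                 # sort rows inside each bucket
        .saveAsTable(table_name)
    )


# Usage, assuming `df` has the columns col1 and col3:
# write_bucketed_table(df, "events_bucketed", "col1", "col3", num_buckets=8)
```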

For example:
dataframe.write \
    .format("parquet") \
    .mode(saveMode) \
    .option("path", "/data/cloud") \
    .save()
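
The saveMode placeholder takes one of Spark's save modes. A minimal sketch, assuming the string forms accepted by PySpark's DataFrameWriter.mode(): append, overwrite, ignore, and error or errorifexists (the default). The helper save_as_parquet and its early validation are illustrative additions, not part of Spark.

```python
VALID_SAVE_MODES = {"append", "overwrite", "ignore", "error", "errorifexists"}


def save_as_parquet(df, path, save_mode="errorifexists"):
    """Write df to `path` as Parquet using the given save mode."""
    mode = save_mode.lower()
    if mode not in VALID_SAVE_MODES:
        # Fail before touching storage if the mode is misspelled.
        raise ValueError(f"Unknown save mode: {save_mode!r}")
    (
        df.write
        .format("parquet")
        .mode(mode)
        .option("path", path)
        .save()
    )


# Usage, assuming `dataframe` already exists:
# save_as_parquet(dataframe, "/data/cloud", "overwrite")
```

With overwrite, existing data at the path is replaced; with append, new files are added next to it; with ignore, the write is skipped if data already exists.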

Author: CloudVikas