What features make Snowflake’s data architecture great?
If a data consumer does not have a Snowflake account, the provider can create a reader account and share data with them without requiring them to become a Snowflake customer. All shared database objects are read-only, so data consumers cannot alter the data.
In secure data sharing, data is not copied or moved between accounts, so shared data takes up no storage in the consumer account. Consumers are charged only for the compute resources they use to query the shared data. With a reader account, the provider is charged for the compute resources its readers use.
Data sharing between accounts
Snowflake’s secure data sharing lets you share objects from a database in your account with another Snowflake account without duplicating the data. As a result, the shared data requires no additional storage and does not add to the data consumer’s storage costs. Because sharing is handled through Snowflake’s metadata store, setup is simple and data consumers get immediate access to the data.
Snowflake’s design allows for networks of data providers and data consumers that serve a variety of purposes. One of these is the Snowflake Data Marketplace, which connects providers with consumers who want to exchange free or paid data. Consumers can access shared data directly in their own accounts, where they can query it and combine it with other data sources as needed. Another is the Data Exchange, which lets an organization collaborate on data with invited members. This simplifies scenarios such as sharing data among corporate customers, suppliers, and partners.
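As a rough illustration of how little setup a share involves, the sketch below assembles the SQL statements a provider and a consumer would typically run. The helper functions and every object name (sales_share, sales_db, orders, the account identifiers) are hypothetical placeholders, not names from this article. The script only builds and prints statement text and does not connect to Snowflake.

```python
def provider_statements(share, database, schema, table, consumer_account):
    """Return the SQL a provider might run to share one table (illustrative only)."""
    return [
        f"CREATE SHARE {share};",
        # The share needs usage on the containing database and schema...
        f"GRANT USAGE ON DATABASE {database} TO SHARE {share};",
        f"GRANT USAGE ON SCHEMA {database}.{schema} TO SHARE {share};",
        # ...and read-only access to the shared object itself.
        f"GRANT SELECT ON TABLE {database}.{schema}.{table} TO SHARE {share};",
        # Finally, make the share visible to the consumer account.
        f"ALTER SHARE {share} ADD ACCOUNTS = {consumer_account};",
    ]


def consumer_statement(local_db, provider_account, share):
    """Return the SQL a consumer might run to mount the share as a read-only database."""
    return f"CREATE DATABASE {local_db} FROM SHARE {provider_account}.{share};"


# Hypothetical names, chosen only for this example.
provider_sql = provider_statements("sales_share", "sales_db", "public", "orders", "consumer_acct")
consumer_sql = consumer_statement("shared_sales", "provider_acct", "sales_share")

for statement in provider_sql + [consumer_sql]:
    print(statement)
```

No data is copied at any step: the grants only record which objects the share exposes, and the consumer's database is a read-only reference to the provider's data.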
Data sharing the traditional way
Data sharing in the traditional sense is akin to handing someone a jigsaw puzzle of your data without the box that reveals the final product.
Traditionally, providers have had to break their databases down into separate files and then transfer those files to the consumer using methods such as FTP, S3, or external hard drives. After receiving the files, the consumer must reconstruct the database before the data can be viewed or used.
Continuous data pipelines
Continuous data pipelines automate many of the manual steps involved in loading data into Snowflake tables and then transforming it for further analysis. Snowflake provides features for continuous data ingestion (Snowpipe), change data tracking (streams), and recurring actions (tasks), which can be combined to build continuous pipeline workflows.
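The three capabilities mentioned above correspond to Snowflake's pipes (continuous ingestion), streams (change data tracking), and tasks (recurring actions). The sketch below assembles one possible set of statements that wires them together. All object names (orders_stage, raw_orders, its payload column, orders_clean, etl_wh) are hypothetical, and the script only prints SQL text rather than running it.

```python
def pipeline_statements(stage, raw_table, clean_table, warehouse):
    """Return SQL for a minimal pipe -> stream -> task pipeline (illustrative only)."""
    pipe = f"{raw_table}_pipe"
    stream = f"{raw_table}_stream"
    task = f"load_{clean_table}"
    return [
        # Continuous ingestion: load new files from the stage as they arrive.
        f"CREATE PIPE {pipe} AUTO_INGEST = TRUE AS "
        f"COPY INTO {raw_table} FROM @{stage} FILE_FORMAT = (TYPE = 'JSON');",
        # Change data tracking: record rows added to the raw table.
        f"CREATE STREAM {stream} ON TABLE {raw_table};",
        # Recurring action: every 5 minutes, move new rows if the stream has any.
        f"CREATE TASK {task} WAREHOUSE = {warehouse} SCHEDULE = '5 MINUTE' "
        f"WHEN SYSTEM$STREAM_HAS_DATA('{stream.upper()}') AS "
        f"INSERT INTO {clean_table} SELECT payload FROM {stream} "
        f"WHERE METADATA$ACTION = 'INSERT';",
        # Tasks are created suspended, so the task must be resumed to start running.
        f"ALTER TASK {task} RESUME;",
    ]


# Hypothetical names, chosen only for this example.
pipeline_sql = pipeline_statements("orders_stage", "raw_orders", "orders_clean", "etl_wh")
for statement in pipeline_sql:
    print(statement)
```

This assumes raw_orders has a single VARIANT column named payload. A real pipeline would also need the stage, file format, and cloud event notifications configured for the environment.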
Support for semi-structured data
One of Snowflake’s most significant strengths for big data workloads is its ability to combine structured and semi-structured data without complex technologies such as Hadoop or Hive. Data can come from many sources, including machine-generated data, sensors, and mobile devices. Snowflake ingests semi-structured data in a variety of formats, including JSON, Avro, ORC, Parquet, and XML, using the VARIANT data type, which has a size limit of 16 MB. Snowflake also optimizes the data by extracting as much of it as possible into columnar form and storing the rest in a single column.
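To make the idea of keeping as much as possible in columnar form and the remainder in a single column more concrete, here is a small conceptual sketch. It is not Snowflake's actual algorithm: the shred function and its rule (a key becomes a column only if every record has it with the same scalar type) are assumptions made purely for illustration.

```python
import json

SCALARS = (str, int, float, bool)


def shred(records):
    """Split JSON objects into per-key columns plus a leftover dict per row."""
    # Keep only keys that appear in every record with the same scalar type.
    candidates = None
    for rec in records:
        typed = {k: type(v) for k, v in rec.items() if isinstance(v, SCALARS)}
        if candidates is None:
            candidates = typed
        else:
            candidates = {k: t for k, t in candidates.items() if typed.get(k) is t}
    candidates = candidates or {}

    # Columnar part: one list of values per extracted key.
    columns = {k: [rec[k] for rec in records] for k in candidates}
    # Everything else stays together in a single "rest" value per row.
    rest = [{k: v for k, v in rec.items() if k not in candidates} for rec in records]
    return columns, rest


# Hypothetical sensor readings, one JSON document per line.
raw = [
    '{"device": "a1", "temp": 21.5, "tags": ["lab"]}',
    '{"device": "b2", "temp": 19.0, "battery": 87}',
]
columns, rest = shred([json.loads(line) for line in raw])
print(columns)
print(rest)
```

Here device and temp appear in every record with a consistent type, so they become columns, while the irregular tags and battery fields remain in the per-row leftover.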