Apache NiFi Adds Python Processor Support
Written by Kay Ewbank
Tuesday, 09 July 2024
Apache NiFi 2, a project for processing and distributing data, has been released with support for Python processors in the MiNiFi framework and a completely rebuilt user interface.

Apache NiFi is based on the NiagaraFiles software developed by the US National Security Agency (NSA), which was open sourced in 2014. The name NiFi derives from Niagara Files.

NiFi can be used to automate the flow of data between software systems. It supports ETL (extract, transform, load) workflows, can operate within clusters, and secures communication with TLS encryption. A typical deployment places NiFi as the consumer between Kafka and HDFS. NiFi also provides schema validation for event streams, and flows can modify and republish secure event streams for general use. It can also be used to monitor data flows and identify potential problems, and to secure data flows by encrypting data at rest and in transit.

NiFi executes within a JVM on a host operating system. Its primary components start with a web server that hosts NiFi's HTTP-based command and control API. There's a flow controller that provides threads for extensions to run on and manages the schedule of when extensions receive resources to execute.

NiFi uses the concept of FlowFiles, which represent objects moving through the system. For each FlowFile, NiFi keeps track of a map of key/value pair attribute strings and its associated content of zero or more bytes. The state of active FlowFiles is stored in a FlowFile repository. There's also a content repository that stores the actual content bytes of a given FlowFile, and a provenance repository where all provenance event data is stored. NiFi is extensible by developers, and the extensions operate and execute within the JVM.

This release of NiFi has a rebuilt user interface that lets the system or the user select a dark mode. More usefully, it now supports Kafka 3 for both consuming from and publishing to Kafka.
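To make the FlowFile description above more concrete, here is a minimal Python sketch of the data model: a FlowFile as a map of string attributes plus content of zero or more bytes, with toy stand-ins for the FlowFile, content and provenance repositories. The class names, method names and event labels are hypothetical and chosen for illustration only. This is not NiFi's actual API, and real NiFi persists these repositories to disk rather than holding them in memory.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class FlowFile:
    """Hypothetical model of a FlowFile: string attributes plus content bytes."""
    attributes: Dict[str, str] = field(default_factory=dict)
    content: bytes = b""  # zero or more bytes


class InMemoryRepositories:
    """Toy in-memory stand-ins for the three repositories (illustrative only)."""

    def __init__(self) -> None:
        self.flowfile_repo: Dict[str, Dict[str, str]] = {}   # id -> attributes (state)
        self.content_repo: Dict[str, bytes] = {}             # id -> content bytes
        self.provenance_repo: List[Tuple[str, str]] = []     # (event label, id)

    def create(self, flowfile: FlowFile) -> str:
        """Register a new FlowFile and record a provenance event for it."""
        ff_id = str(uuid.uuid4())
        self.flowfile_repo[ff_id] = dict(flowfile.attributes)
        self.content_repo[ff_id] = flowfile.content
        self.provenance_repo.append(("CREATE", ff_id))
        return ff_id

    def update_attribute(self, ff_id: str, key: str, value: str) -> None:
        """Change one attribute; the content bytes are left untouched."""
        self.flowfile_repo[ff_id][key] = value
        self.provenance_repo.append(("ATTRIBUTES_MODIFIED", ff_id))
```

In this sketch, attributes and content are stored separately, so changing an attribute records a provenance event without rewriting the content bytes. That mirrors the separation between the FlowFile repository and the content repository described above.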
The NiFi team says this version can now split binary Packet Capture (PCAP) files with SplitPCAP, and Microsoft Excel XLSX files can be split into individual sheets with SplitExcel. They say this is:

"a good example of the increasingly common usage of NiFi in the wild to capture and transform unstructured or semi-structured data and deliver it to systems such as databases, vector stores, and more."

There's also a new interface for Python extensions that supports components which source new data. Still on the Python front, there's now Python Processor support in the MiNiFi framework.

NiFi 2 is available now.
Last Updated (Tuesday, 09 July 2024)