Releases: MarquezProject/marquez
Marquez 0.22.0
Added
- Add support for LifecycleStateChangeFacet with an ability to softly delete datasets #1847 @pawel-big-lebowski
- Enable pod specific annotations in Marquez Helm Chart via marquez.podAnnotations #1945 @wslulciuc
- Add support for job renaming/redirection via symlink #1947 @collado-mike
- Add Created by view for dataset versions along with SQL syntax highlighting in web UI #1929 @phixMe
- Add operationId to openapi spec #1978 @phixMe
Changed
- Upgrade Flyway to v7.6.0 #1974 @dakshin-k
Fixed
- Remove size limits on namespaces, dataset names, and source connection URLs #1925 @collado-mike
- Update namespace names to allow =, @, and ; #1936 @mobuchowski
- Time duration display in web UI #1950 @phixMe
- Enable web UI to access API via Helm Chart @GZack2000
Marquez 0.21.0
Added
- Add MDC to the LoggingMdcFilter to include API method, path, and request ID @fm100
- Add Postgres sub-chart to Helm deployment for easier installation option @KevinMellott91
- GitHub Action workflow to validate changes to Helm chart @KevinMellott91
Changed
- Upgrade from Java11 to Java17 @ucg8j
- Switch JDK image from alpine to temurin enabling Marquez to run on multiple CPU architectures @ucg8j
Fixed
- Error when running Marquez on Apple M1 @ucg8j
Removed
- The /api/v1-beta/lineage endpoint @wslulciuc
- The marquez-airflow lib. has been removed. Please use the openlineage-airflow library instead. To migrate to using openlineage-airflow, make the following changes @wslulciuc:

      # Update the import in your DAG definitions
      -from marquez_airflow import DAG
      +from openlineage.airflow import DAG

      # Update the following environment variables in your Airflow instance
      -MARQUEZ_URL
      +OPENLINEAGE_URL
      -MARQUEZ_NAMESPACE
      +OPENLINEAGE_NAMESPACE

- The marquez-spark lib. has been removed. Please use the openlineage-spark library instead. To migrate to using openlineage-spark, make the following changes @wslulciuc:

      SparkSession.builder()
      - .config("spark.jars.packages", "io.github.marquezproject:marquez-spark:0.20.+")
      + .config("spark.jars.packages", "io.openlineage:openlineage-spark:0.2.+")
      - .config("spark.extraListeners", "marquez.spark.agent.SparkListener")
      + .config("spark.extraListeners", "io.openlineage.spark.agent.OpenLineageSparkListener")
        .config("spark.openlineage.host", "https://api.demo.datakin.com")
        .config("spark.openlineage.apiKey", "your datakin api key")
        .config("spark.openlineage.namespace", "<NAMESPACE_NAME>")
        .getOrCreate()

Marquez 0.20.0
Added
- Add deploy docs for running Marquez on AWS @wslulciuc @merobi-hub
Changed
- Clarify docs on using OpenLineage for metadata collection @fm100
- Upgrade to gradle 7.x @wslulciuc
- Use eclipse-temurin for Marquez API base docker image @fm100
Deprecated
- The following endpoints have been deprecated and are scheduled to be removed in 0.25.0. Please use the /lineage endpoint when collecting source, dataset, and job metadata @wslulciuc:
Fixed
- Validation of OpenLineage events on write @collado-mike
- Increase name column size for tables namespaces and sources @mmeasic
Security
Marquez 0.19.1
Fixed
- URI and URL DB mapper should handle empty string as null @OleksandrDvornik
- Fix NodeId parsing when dataset name contains struct<> @fm100
- Add encoding for dataset names in URL construction @collado-mike
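
The encoding fixes above matter because namespaces and dataset names may contain characters such as /, :, < and > that are reserved or unsafe in URL paths. A minimal sketch of percent-encoding each path segment, assuming the /api/v1/namespaces/{namespace}/datasets/{dataset} path layout; the helper name dataset_url and the base URL are hypothetical, and this is not the code Marquez itself uses:

```python
from urllib.parse import quote


def dataset_url(base_url: str, namespace: str, dataset: str) -> str:
    """Build a dataset URL with every path segment percent-encoded.

    Hypothetical helper for illustration only. safe="" makes quote()
    encode "/" as well, so a name like "a/b" stays a single segment.
    """
    encoded_namespace = quote(namespace, safe="")
    encoded_dataset = quote(dataset, safe="")
    return f"{base_url}/api/v1/namespaces/{encoded_namespace}/datasets/{encoded_dataset}"


if __name__ == "__main__":
    # Assumed local base URL; adjust to your deployment.
    print(dataset_url("http://localhost:5000", "s3://bucket", "events struct<int>"))
```

Encoding each segment separately, rather than the whole URL at once, keeps the separators between segments intact while still escaping slashes inside a name.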
Marquez 0.19.0
Added
- Add simple python client example @wslulciuc
- Display dataset versions in web UI 🎉 @phixMe
- Display runs and run facets in web UI 🎉 @phixMe
- Facet formatting and highlighting as JSON in web UI @phixMe
- Add option for docker/up.sh to run in the background @rossturk
- Return totalCount in lists of jobs and datasets @phixMe
Changed
- Change type column in dataset_fields table to TEXT @wslulciuc
- Set ZonedDateTime parsing to support optional offsets and default to server timezone @collado-mike
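
To illustrate the "optional offset" behavior described in the last item: a timestamp that carries an offset keeps it, and one without an offset is assigned a default zone. The sketch below mimics that rule in Python; it is not the Java ZonedDateTime code in Marquez, and the function name and the UTC default are assumptions made for the example (the changelog says the server timezone is used):

```python
from datetime import datetime, timezone, tzinfo


def parse_with_default_zone(text: str, default_tz: tzinfo = timezone.utc) -> datetime:
    """Parse an ISO-8601 timestamp whose UTC offset is optional.

    Hypothetical helper: if the input has no offset, attach default_tz
    instead of failing. A real server would pass its own timezone here.
    """
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        # No offset in the input: fall back to the default zone.
        parsed = parsed.replace(tzinfo=default_tz)
    return parsed


if __name__ == "__main__":
    print(parse_with_default_zone("2021-10-01T12:00:00+02:00"))
    print(parse_with_default_zone("2021-10-01T12:00:00"))
```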
Fixed
- Job.location and Source.connectionUrl should be in URI format on write @OleksandrDvornik
- Z-Index fix for nodes and edges in lineage graph @phixMe
- Format of the index files for web UI @phixMe
- Fix OpenLineage API to return correct response codes for exceptions propagated from async calls @collado-mike
- Stopped overwriting nominal time information with nulls @mobuchowski
Removed
- WriteOnly clients for java and python. Before OpenLineage, we added a WriteOnly implementation to our clients to emit calls to a backend. A backend enabled collecting raw HTTP requests to an HTTP endpoint, console, or file. This was our way of capturing lineage events that could then be used to automatically create resources on the Marquez backend. We soon worked on a standard that eventually became OpenLineage. That is, OpenLineage removed the need to make individual calls to create a namespace, a source, a dataset, etc.; instead, the backend accepts a single event with metadata that it can process. @wslulciuc
Marquez 0.18.0
Added
- New Add Search API 🎉 @wslulciuc
- Add .env.example to override variables defined in docker-compose files @wslulciuc
Changed
- Add openlineage-java as dependency @OleksandrDvornik
- Move class SentryConfig from marquez to marquez.tracing pkg
- Major UI improvements; the UI now uses the Search and Lineage APIs 🎉 @phixMe
- Set default API port to 8080 when running the Marquez shadow jar @wslulciuc
Fixed
- Update examples/airflow to use openlineage-airflow and fix the SQL in DAG troubleshooting step @wslulciuc
Removed
- Drop job_versions_io_mapping_inputs and job_versions_io_mapping_outputs tables @OleksandrDvornik
Marquez 0.17.0
Changed
- Updated Lineage runs query to improve performance, added tests @collado-mike
- Add POST /api/v1/lineage endpoint to docs and deprecate run endpoints @wslulciuc
- Drop FieldType enum @wslulciuc
Deprecated
- Run API endpoints that create or modify a job run (scheduled to be removed in 0.19.0). Please use the POST /api/v1/lineage endpoint when collecting job run metadata. @wslulciuc
- Airflow integration, please use the openlineage-airflow library instead. @wslulciuc
- Spark integration, please use the openlineage-spark library instead. @wslulciuc
- Write only clients for java and python (scheduled to be removed in 0.19.0) @wslulciuc
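
For readers moving off the deprecated run endpoints, the sketch below shows roughly what collecting job run metadata through POST /api/v1/lineage can look like. It builds a minimal OpenLineage-style run event and an HTTP request using only the Python standard library. The helper names, the producer URL, and the base URL are hypothetical; a real event may need additional fields (for example schemaURL, inputs, or outputs), so treat this as an illustration under those assumptions rather than a reference client:

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Optional
from urllib.request import Request


def build_run_event(namespace: str, job_name: str, event_type: str,
                    run_id: Optional[str] = None) -> dict:
    """Build a minimal OpenLineage-style run event (hypothetical helper)."""
    return {
        "eventType": event_type,  # e.g. "START" or "COMPLETE"
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": run_id or str(uuid.uuid4())},
        "job": {"namespace": namespace, "name": job_name},
        "inputs": [],
        "outputs": [],
        # Placeholder producer URI, not a real endpoint.
        "producer": "https://example.com/my-producer",
    }


def build_lineage_request(base_url: str, event: dict) -> Request:
    """Wrap the event in a POST request to /api/v1/lineage (not sent here)."""
    return Request(
        url=f"{base_url}/api/v1/lineage",
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    event = build_run_event("my-namespace", "my-job", "START")
    request = build_lineage_request("http://localhost:5000", event)
    # Sending is left out on purpose; urllib.request.urlopen(request) would do it.
    print(request.get_method(), request.full_url)
```

Sending a START event followed by a COMPLETE event with the same runId is the usual way to describe one run, replacing the separate create-run and mark-run calls.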
Removed
- Dbt integration lib. @wslulciuc
- Common integration lib. @wslulciuc
Marquez 0.16.1
Fixed
- dbt packages should look for namespace packages @mobuchowski
- Add common integration dependency to dbt plugins @mobuchowski
- DatasetVersionDao queries missing input and output facets @dominiquetipton
- (De)serialization issue for Run and JobData models @collado-mike
- Prefix spark openlineage.* configuration parameters with spark.* @collado-mike
- Parse multi-statement sql in class SqlParser used in Airflow integration @wslulciuc
- URL-encode namespace on calls to API backend @phixMe
Marquez 0.16.0
Added
- New Add JobVersion API 🎉 @collado-mike
- New Add DBT integrations for BigQuery and Snowflake 🎉 @mobuchowski
Changed
- Reverted delete of BigQueryNodeVisitor to work with vanilla SparkListener @collado-mike
- Promote Lineage API out of beta @OleksandrDvornik
Fixed
- Display job SQL in UI @phixMe
- Allow upsert of tags @hanbei
- Allow potentially ambiguous URIs with encoded path segments @mobuchowski
- Use source naming convention defined by OpenLineage @mobuchowski
- Return dataset facets @collado-mike
- BigQuery source naming in integrations @mobuchowski
Marquez 0.15.2
Added
- Add endpoint to create tags @hanbei
Fixed
- Fixed build & release process for python marquez-integration-common package @collado-mike
- Fixed snowflake and bigquery errors when connector libraries not loaded @collado-mike
- Fixed OpenLineage API not setting Dataset current_version_uuid #1361 @collado-mike