diff --git a/OCTPDBA-380 - POC data anonymization with Data Masker by RedGate tool/.vs/OCTPDBA-380 - POC data anonymization with Data Masker by RedGate tool/v15/.ssms_suo b/OCTPDBA-380 - POC data anonymization with Data Masker by RedGate tool/.vs/OCTPDBA-380 - POC data anonymization with Data Masker by RedGate tool/v15/.ssms_suo new file mode 100644 index 0000000..2acc2f2 Binary files /dev/null and b/OCTPDBA-380 - POC data anonymization with Data Masker by RedGate tool/.vs/OCTPDBA-380 - POC data anonymization with Data Masker by RedGate tool/v15/.ssms_suo differ diff --git a/OCTPDBA-380 - POC data anonymization with Data Masker by RedGate tool/OCTPDBA-380 - POC data anonymization with Data Masker by RedGate tool.ssmssln b/OCTPDBA-380 - POC data anonymization with Data Masker by RedGate tool/OCTPDBA-380 - POC data anonymization with Data Masker by RedGate tool.ssmssln new file mode 100644 index 0000000..9310d15 --- /dev/null +++ b/OCTPDBA-380 - POC data anonymization with Data Masker by RedGate tool/OCTPDBA-380 - POC data anonymization with Data Masker by RedGate tool.ssmssln @@ -0,0 +1,22 @@ + +Microsoft Visual Studio Solution File, Format Version 12.00 +# SQL Server Management Studio Solution File, Format Version 18.00 +VisualStudioVersion = 15.0.28307.421 +MinimumVisualStudioVersion = 10.0.40219.1 +Project("{4F2E2C19-372F-40D8-9FA7-9D2138C6997A}") = "OCTPDBA-380 - POC data anonymization with Data Masker by RedGate tool", "OCTPDBA-380 - POC data anonymization with Data Masker by RedGate tool.ssmssqlproj", "{29821CEB-6CB3-403C-BBA9-B6A97BB97946}" +EndProject +Global + GlobalSection(SolutionConfigurationPlatforms) = preSolution + Default|Default = Default|Default + EndGlobalSection + GlobalSection(ProjectConfigurationPlatforms) = postSolution + {29821CEB-6CB3-403C-BBA9-B6A97BB97946}.Default|Default.ActiveCfg = Default + {CDE4914C-2B09-495E-8198-B9A93CFE6162}.Default|Default.ActiveCfg = Default + EndGlobalSection + GlobalSection(SolutionProperties) = preSolution + 
HideSolutionNode = FALSE + EndGlobalSection + GlobalSection(ExtensibilityGlobals) = postSolution + SolutionGuid = {3379C60C-77A0-4614-8BAD-C86F249B7F52} + EndGlobalSection +EndGlobal diff --git a/OCTPDBA-380 - POC data anonymization with Data Masker by RedGate tool/OCTPDBA-380 - POC data anonymization with Data Masker by RedGate tool.ssmssqlproj b/OCTPDBA-380 - POC data anonymization with Data Masker by RedGate tool/OCTPDBA-380 - POC data anonymization with Data Masker by RedGate tool.ssmssqlproj new file mode 100644 index 0000000..25cfc75 --- /dev/null +++ b/OCTPDBA-380 - POC data anonymization with Data Masker by RedGate tool/OCTPDBA-380 - POC data anonymization with Data Masker by RedGate tool.ssmssqlproj @@ -0,0 +1,9 @@ + + + + + + + + + \ No newline at end of file diff --git a/OCTPDBA-380 - POC data anonymization with Data Masker by RedGate tool/analyze.txt b/OCTPDBA-380 - POC data anonymization with Data Masker by RedGate tool/analyze.txt new file mode 100644 index 0000000..661e529 --- /dev/null +++ b/OCTPDBA-380 - POC data anonymization with Data Masker by RedGate tool/analyze.txt @@ -0,0 +1,32 @@ +resources: + https://www.red-gate.com/hub/university/courses/product-training?tool=data-masker&level=get-started + getting-started videos directly from Redgate + https://www.red-gate.com/hub/university/courses/sql-data-catalog/end-to-end-data-protection-with-sql-data-catalog-and-sql-provision + end-to-end video showing the process of Cataloging, Masking and Cloning + + https://learn.microsoft.com/en-us/sql/relational-databases/security/dynamic-data-masking?view=sql-server-ver16 + SQL Server >= 2016 also implements data masking + + +interesting sites: + https://plantbasedsql.com/tag/data-masking/ + a former Redgate employee who worked on Data Masker and describes the process + +many examples make use of Redgate SQL Data Catalog + https://www.red-gate.com/products/dba/sql-data-catalog/ + + SQL Data Catalog allows users to catalog their SQL Server data estate by 
applying classifications, as tags and free-text labels, to SQL Server objects. The taxonomy of tags and attributes to be applied is also created and managed by this product. A common use case for the tool is for classifying columns by their sensitivity under data privacy regulations such as the GDPR. + +and perhaps more interesting is the "SQL Provision" package: + SQL Provision is a solution for (compliant) test data management that combines two Redgate products into a single offering: Data Masker for static data masking, and SQL Clone for database cloning and provisioning. + +terminology: + Static data masking is the process of de-identifying sensitive data-at-rest within the tables of your Database. It is typically used to provide realistic, Production-like data into non-Production environments like Dev and Test, and even sets that are given to 3rd parties. This relies on retaining non-sensitive business specific fields within rows and taking anything considered PII (Personally Identifiable Information) or PHI (Protected Health Information) and either scrambling or replacing it with similar but ultimately false data. + + Deterministic data masking is the process of masking data with values in a repeatable way, such that it will give the same value when masked in any and all future runs on any value that matches and will create a new record for values which have not been previously masked. An example of this would be if you were to mask “Chris Unwin” to “Brad Pitt”, it should appear as “Brad Pitt” not only in our (for example) dbo.Contacts table but also all associated tables (regardless of PKFK relationships at the DB level) and every single run should provide the same output. This is useful for building up familiarity with the data and utilizing for future test runs. + +in our case, deterministic masking is what we will need to use. +To do so, we need to: + +1. Identify the fields and tables that need to be masked +2. 
define the masking rules \ No newline at end of file diff --git a/OCTPDBA-380 - POC data anonymization with Data Masker by RedGate tool/bloor-research-redgate-inbrief-2021.pdf b/OCTPDBA-380 - POC data anonymization with Data Masker by RedGate tool/bloor-research-redgate-inbrief-2021.pdf new file mode 100644 index 0000000..714463d Binary files /dev/null and b/OCTPDBA-380 - POC data anonymization with Data Masker by RedGate tool/bloor-research-redgate-inbrief-2021.pdf differ diff --git a/OCTPDBA-431 - Spike create numeric Pharmacode column to improve joins/.vs/OCTPDBA-431 - Spike create numeric Pharmacode column to improve joins/v15/.ssms_suo b/OCTPDBA-431 - Spike create numeric Pharmacode column to improve joins/.vs/OCTPDBA-431 - Spike create numeric Pharmacode column to improve joins/v15/.ssms_suo index 598ad99..61d0cfd 100644 Binary files a/OCTPDBA-431 - Spike create numeric Pharmacode column to improve joins/.vs/OCTPDBA-431 - Spike create numeric Pharmacode column to improve joins/v15/.ssms_suo and b/OCTPDBA-431 - Spike create numeric Pharmacode column to improve joins/.vs/OCTPDBA-431 - Spike create numeric Pharmacode column to improve joins/v15/.ssms_suo differ diff --git a/OCTPDBA-431 - Spike create numeric Pharmacode column to improve joins/OCTPDBA-431 - Spike create numeric Pharmacode column to improve joins.ssmssqlproj b/OCTPDBA-431 - Spike create numeric Pharmacode column to improve joins/OCTPDBA-431 - Spike create numeric Pharmacode column to improve joins.ssmssqlproj index 8dc1af6..fe58689 100644 --- a/OCTPDBA-431 - Spike create numeric Pharmacode column to improve joins/OCTPDBA-431 - Spike create numeric Pharmacode column to improve joins.ssmssqlproj +++ b/OCTPDBA-431 - Spike create numeric Pharmacode column to improve joins/OCTPDBA-431 - Spike create numeric Pharmacode column to improve joins.ssmssqlproj @@ -15,13 +15,25 @@ NotSpecified Microsoft SQL Server Management Studio - Query + + 2023-01-06T13:46:14.5715315+01:00 + SQL + suncent + + Windows 
Authentication + + 30 + 0 + NotSpecified + Microsoft SQL Server Management Studio - Query + - 8c91a03d-f9b4-46c0-a305-b5dcc79ff907:(local):True - (local) + 8c91a03d-f9b4-46c0-a305-b5dcc79ff907:suncent:True + suncent check data.sql diff --git a/OCTPDBA-431 - Spike create numeric Pharmacode column to improve joins/check data.sql b/OCTPDBA-431 - Spike create numeric Pharmacode column to improve joins/check data.sql index 78494aa..934cc79 100644 --- a/OCTPDBA-431 - Spike create numeric Pharmacode column to improve joins/check data.sql +++ b/OCTPDBA-431 - Spike create numeric Pharmacode column to improve joins/check data.sql @@ -1,25 +1,51 @@ USE [Arizona] +DECLARE @phidx VARCHAR(7) = '1065852'; -SELECT [PHGD_ACSC_PharmacodeNum] -FROM [dbo].[PHGD_ACSC] [pa] -WHERE TRY_CONVERT(INT,[pa].[PHGD_ACSC_PharmacodeNum]) IS NULL - -SELECT [pa].[PHGD_ACXI_PharmacodeNum] -FROM [dbo].[PHGD_ACXI] [pa] -WHERE TRY_CONVERT(INT,[pa].[PHGD_ACXI_PharmacodeNum]) IS NULL - -SELECT TOP 10 * +/* +--to find a phidx with several subs: +SELECT TOP 1000 + [ITK_key], COUNT(1) AS cnt FROM [dbo].[Item_key] [ik] -WHERE [ik].[ITK_type]=1 --phcode +WHERE [ik].[ITK_type] = 1 --phcode +AND EXISTS( + SELECT 1 + FROM [dbo].[PHGD_ACSC] [pa] + WHERE [pa].[PHGD_ACSC_PharmacodeNum] = [ik].[ITK_key] +) +AND EXISTS( + SELECT 1 + FROM [dbo].[PHGD_ACXI] [pa2] + WHERE [pa2].[PHGD_ACXI_PharmacodeNum] = [ik].[ITK_key] +) +GROUP BY [ik].[ITK_key] +HAVING COUNT(1)>1 +ORDER BY [ik].[ITK_key] DESC +; +*/ -SELECT TOP 10 - i.* +SELECT * +FROM [dbo].[PHGD_ACSC] [pa] +WHERE [pa].[PHGD_ACSC_PharmacodeNum]=@phidx + +SELECT * +FROM [dbo].[PHGD_ACXI] [pa] +WHERE [pa].[PHGD_ACXI_PharmacodeNum]=@phidx + +SELECT + [i].[Item_ID] ,[it].[ITTX_description] ,[it].[ITTX_language] - ,ik.[Item_key_ID] - ,ik.[ITK_key] - ,ik.[ITK_label_text] + ,[ik].[Item_key_ID] + ,[ik].[ITK_key] + ,[ik].[ITK_label_text] + ,[ik].[ITK_subsidiary] + ,[pi].* FROM [dbo].[Item] [i] - JOIN [dbo].[Item_text] [it] ON it.[ITTX_item] = i.[Item_ID] JOIN 
[dbo].[Item_key] [ik] ON ik.[ITK_item] = i.[Item_ID] + JOIN [dbo].[PH_item] [pi] ON [pi].[PHIT_item] = i.[Item_ID] + LEFT JOIN [dbo].[Item_text] [it] ON it.[ITTX_item] = i.[Item_ID] AND [it].[ITTX_language] = 1 +WHERE [ik].[ITK_type] = 1 --phcode +AND [ik].[ITK_key]=@phidx +ORDER BY [ik].[ITK_key] +; diff --git a/OCTPDBA-431 - Spike create numeric Pharmacode column to improve joins/todo.sql b/OCTPDBA-431 - Spike create numeric Pharmacode column to improve joins/todo.sql index c6075e0..681a144 100644 --- a/OCTPDBA-431 - Spike create numeric Pharmacode column to improve joins/todo.sql +++ b/OCTPDBA-431 - Spike create numeric Pharmacode column to improve joins/todo.sql @@ -29,4 +29,33 @@ Question: Add column Item_id and delete column Pharmacodenum * item_key.ITK_key is of type [dbo].[Normal_name_field], which is a varchar(30), so there is no pharmacode stored as an integer in the db + +Feedback from the presentation to the team: + define scenarios and evaluate the cases with Roger Gruetter. + We should avoid using the phcode to make the links. + Problems if the subsidiaries come into play: duplicated results? + + check with the devs whether the performance problems are troublesome enough to address them directly, rather than reworking the structure. + +Scenarios: + !!!! dbo.item and dbo.ph_item are subsidiarized via dbo.item_key !!! + A link between dbo.ph_item or dbo.item and the phgd_xxxx tables therefore creates duplicates caused by the subs in the central database. + + Adding a "pharmacode" column in dbo.ph_item + + simplify the link + + as this is related to the pharmindex data only, it does not involve the subsidiaries. + + - could cause issues because dbo.ph_item is subsidiarized and multiple rows with the same pharmacode would exist in the centrals. + - We duplicate the pharmacode. The "official" one is in the table dbo.item_key + - We need to implement logic in triggers to maintain the satellite tables when the item_key value changes / is created + - The pharmacode is not a value that is fixed in time. 
It can be re-used, which makes it a bad key for joins, as it might represent different articles over time. + + Adding a FK to ph_item in both tables + this brings the same advantages and questions as the pharmacode, but with a stable relation that will not change over time + + simplify the link + - could cause issues because dbo.ph_item is subsidiarized and multiple rows with the same pharmacode would exist in the centrals. + +check with Roland Berger about these questions +or else with Claude Castella's team + */ \ No newline at end of file