Feed aggregator

PostgreSQL partitioning (4): Hash partitioning

Yann Neuhaus - Tue, 2019-06-04 01:00

The last partitioning strategy we will look at is hash partitioning. If you missed the first posts in this series, here they are:

  1. PostgreSQL partitioning (1): Preparing the data set
  2. PostgreSQL partitioning (2): Range partitioning
  3. PostgreSQL partitioning (3): List partitioning

Hash partitioning is usually used when you do not have a natural way of partitioning your data, or when you want to distribute the data evenly based on a hash of the partition key.

In PostgreSQL, hash partitioning might seem a bit strange at first because you define a modulus and a remainder, and those two values determine which partition a row goes to. Looking at the materialized view that contains our data set, which columns might be a good candidate for that?

postgres=# \d mv_traffic_violations 
                 Materialized view "public.mv_traffic_violations"
         Column          |          Type          | Collation | Nullable | Default 
-------------------------+------------------------+-----------+----------+---------
 seqid                   | text                   |           |          | 
 date_of_stop            | date                   |           |          | 
 time_of_stop            | time without time zone |           |          | 
 agency                  | text                   |           |          | 
 subagency               | text                   |           |          | 
 description             | text                   |           |          | 
 location                | text                   |           |          | 
 latitude                | numeric                |           |          | 
 longitude               | numeric                |           |          | 
 accident                | text                   |           |          | 
 belts                   | boolean                |           |          | 
 personal_injury         | boolean                |           |          | 
 property_damage         | boolean                |           |          | 
 fatal                   | boolean                |           |          | 
 commercial_license      | boolean                |           |          | 
 hazmat                  | boolean                |           |          | 
 commercial_vehicle      | boolean                |           |          | 
 alcohol                 | boolean                |           |          | 
 workzone                | boolean                |           |          | 
 state                   | text                   |           |          | 
 vehicletype             | text                   |           |          | 
 year                    | smallint               |           |          | 
 make                    | text                   |           |          | 
 model                   | text                   |           |          | 
 color                   | text                   |           |          | 
 violation_type          | text                   |           |          | 
 charge                  | text                   |           |          | 
 article                 | text                   |           |          | 
 contributed_to_accident | boolean                |           |          | 
 race                    | text                   |           |          | 
 gender                  | text                   |           |          | 
 driver_city             | text                   |           |          | 
 driver_state            | text                   |           |          | 
 dl_state                | text                   |           |          | 
 arrest_type             | text                   |           |          | 
 geolocation             | point                  |           |          | 
 council_districts       | smallint               |           |          | 
 councils                | smallint               |           |          | 
 communities             | smallint               |           |          | 
 zip_codes               | smallint               |           |          | 
 municipalities          | smallint               |           |          | 

A good candidate seems to be the councils column. If we set up the hash partitioning correctly, the result should be five partitions:

postgres=# select distinct councils from mv_traffic_violations ;
 councils 
----------
         
        1
        3
        5
        2
        4
(6 rows)

The partitioned table becomes:

create table traffic_violations_p_hash
( seqid text
, date_of_stop date  
, time_of_stop time
, agency text
, subagency text
, description text 
, location text
, latitude numeric
, longitude numeric
, accident text
, belts boolean
, personal_injury boolean 
, property_damage boolean
, fatal boolean
, commercial_license boolean
, hazmat boolean
, commercial_vehicle boolean
, alcohol boolean
, workzone boolean
, state text
, vehicletype text 
, year smallint
, make text
, model text
, color text
, violation_type text 
, charge text
, article text
, contributed_to_accident boolean
, race text
, gender text
, driver_city text
, driver_state text
, dl_state text
, arrest_type text
, geolocation point
, council_districts smallint 
, councils smallint
, communities smallint
, zip_codes smallint
, municipalities smallint
)
partition by hash (councils);

As usual psql will report the partitioning strategy for the table as well:

postgres=# \d traffic_violations_p_hash 
               Partitioned table "public.traffic_violations_p_hash"
         Column          |          Type          | Collation | Nullable | Default 
-------------------------+------------------------+-----------+----------+---------
 seqid                   | text                   |           |          | 
 date_of_stop            | date                   |           |          | 
 time_of_stop            | time without time zone |           |          | 
 agency                  | text                   |           |          | 
 subagency               | text                   |           |          | 
 description             | text                   |           |          | 
 location                | text                   |           |          | 
 latitude                | numeric                |           |          | 
 longitude               | numeric                |           |          | 
 accident                | text                   |           |          | 
 belts                   | boolean                |           |          | 
 personal_injury         | boolean                |           |          | 
 property_damage         | boolean                |           |          | 
 fatal                   | boolean                |           |          | 
 commercial_license      | boolean                |           |          | 
 hazmat                  | boolean                |           |          | 
 commercial_vehicle      | boolean                |           |          | 
 alcohol                 | boolean                |           |          | 
 workzone                | boolean                |           |          | 
 state                   | text                   |           |          | 
 vehicletype             | text                   |           |          | 
 year                    | smallint               |           |          | 
 make                    | text                   |           |          | 
 model                   | text                   |           |          | 
 color                   | text                   |           |          | 
 violation_type          | text                   |           |          | 
 charge                  | text                   |           |          | 
 article                 | text                   |           |          | 
 contributed_to_accident | boolean                |           |          | 
 race                    | text                   |           |          | 
 gender                  | text                   |           |          | 
 driver_city             | text                   |           |          | 
 driver_state            | text                   |           |          | 
 dl_state                | text                   |           |          | 
 arrest_type             | text                   |           |          | 
 geolocation             | point                  |           |          | 
 council_districts       | smallint               |           |          | 
 councils                | smallint               |           |          | 
 communities             | smallint               |           |          | 
 zip_codes               | smallint               |           |          | 
 municipalities          | smallint               |           |          | 
Partition key: HASH (councils)
Number of partitions: 0

The setup for the hash partitions will be as follows:

create table traffic_violations_p_hash_p1
partition of traffic_violations_p_hash
for values with (modulus 5, remainder 0);

create table traffic_violations_p_hash_p2
partition of traffic_violations_p_hash
for values with (modulus 5, remainder 1);

create table traffic_violations_p_hash_p3
partition of traffic_violations_p_hash
for values with (modulus 5, remainder 2);

create table traffic_violations_p_hash_p4
partition of traffic_violations_p_hash
for values with (modulus 5, remainder 3);

create table traffic_violations_p_hash_p5
partition of traffic_violations_p_hash
for values with (modulus 5, remainder 4);
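
The five statements differ only in the remainder, so instead of typing them out you can also let psql generate them. A minimal sketch using format() and the \gexec meta-command, which executes every row of a result as a statement:

-- generates and runs the same five "create table ... partition of" statements
select format(
         'create table traffic_violations_p_hash_p%s
            partition of traffic_violations_p_hash
            for values with (modulus 5, remainder %s)',
         g + 1, g)
from generate_series(0, 4) g \gexec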

A hash-partitioned table cannot have a default partition: every hash value maps to exactly one modulus/remainder combination, so a default partition could never receive a row. When you try to create one you will get an error:

postgres=# create table traffic_violations_p_hash_default
postgres-# partition of traffic_violations_p_hash default;
psql: ERROR:  a hash-partitioned table may not have a default partition

The final setup of our hash partitioned table is:

postgres=# \d+ traffic_violations_p_hash
                                   Partitioned table "public.traffic_violations_p_hash"
         Column          |          Type          | Collation | Nullable | Default | Storage  | Stats target | Description 
-------------------------+------------------------+-----------+----------+---------+----------+--------------+-------------
 seqid                   | text                   |           |          |         | extended |              | 
 date_of_stop            | date                   |           |          |         | plain    |              | 
 time_of_stop            | time without time zone |           |          |         | plain    |              | 
 agency                  | text                   |           |          |         | extended |              | 
 subagency               | text                   |           |          |         | extended |              | 
 description             | text                   |           |          |         | extended |              | 
 location                | text                   |           |          |         | extended |              | 
 latitude                | numeric                |           |          |         | main     |              | 
 longitude               | numeric                |           |          |         | main     |              | 
 accident                | text                   |           |          |         | extended |              | 
 belts                   | boolean                |           |          |         | plain    |              | 
 personal_injury         | boolean                |           |          |         | plain    |              | 
 property_damage         | boolean                |           |          |         | plain    |              | 
 fatal                   | boolean                |           |          |         | plain    |              | 
 commercial_license      | boolean                |           |          |         | plain    |              | 
 hazmat                  | boolean                |           |          |         | plain    |              | 
 commercial_vehicle      | boolean                |           |          |         | plain    |              | 
 alcohol                 | boolean                |           |          |         | plain    |              | 
 workzone                | boolean                |           |          |         | plain    |              | 
 state                   | text                   |           |          |         | extended |              | 
 vehicletype             | text                   |           |          |         | extended |              | 
 year                    | smallint               |           |          |         | plain    |              | 
 make                    | text                   |           |          |         | extended |              | 
 model                   | text                   |           |          |         | extended |              | 
 color                   | text                   |           |          |         | extended |              | 
 violation_type          | text                   |           |          |         | extended |              | 
 charge                  | text                   |           |          |         | extended |              | 
 article                 | text                   |           |          |         | extended |              | 
 contributed_to_accident | boolean                |           |          |         | plain    |              | 
 race                    | text                   |           |          |         | extended |              | 
 gender                  | text                   |           |          |         | extended |              | 
 driver_city             | text                   |           |          |         | extended |              | 
 driver_state            | text                   |           |          |         | extended |              | 
 dl_state                | text                   |           |          |         | extended |              | 
 arrest_type             | text                   |           |          |         | extended |              | 
 geolocation             | point                  |           |          |         | plain    |              | 
 council_districts       | smallint               |           |          |         | plain    |              | 
 councils                | smallint               |           |          |         | plain    |              | 
 communities             | smallint               |           |          |         | plain    |              | 
 zip_codes               | smallint               |           |          |         | plain    |              | 
 municipalities          | smallint               |           |          |         | plain    |              | 
Partition key: HASH (councils)
Partitions: traffic_violations_p_hash_p1 FOR VALUES WITH (modulus 5, remainder 0),
            traffic_violations_p_hash_p2 FOR VALUES WITH (modulus 5, remainder 1),
            traffic_violations_p_hash_p3 FOR VALUES WITH (modulus 5, remainder 2),
            traffic_violations_p_hash_p4 FOR VALUES WITH (modulus 5, remainder 3),
            traffic_violations_p_hash_p5 FOR VALUES WITH (modulus 5, remainder 4)
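
From PostgreSQL 12 on you can also query the partition hierarchy directly instead of reading it off the \d+ output; a small sketch:

-- relid/parentrelid show the tree, isleaf marks the actual partitions
select relid, parentrelid, isleaf, level
from pg_partition_tree('traffic_violations_p_hash');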

Loading the data and checking the partitions:

postgres=# insert into traffic_violations_p_hash
postgres-#        select * from mv_traffic_violations;
INSERT 0 1528078
postgres=# select count(*) from traffic_violations_p_hash_p1;
 count  
--------
 988085
(1 row)

postgres=# select count(*) from traffic_violations_p_hash_p2;
 count  
--------
 539993
(1 row)

postgres=# select count(*) from traffic_violations_p_hash_p3;
 count 
-------
     0
(1 row)

postgres=# select count(*) from traffic_violations_p_hash_p4;
 count 
-------
     0
(1 row)

postgres=# select count(*) from traffic_violations_p_hash_p5;
 count 
-------
     0
(1 row)

Why do we only have data in the first and second partitions? Let's look more closely at the distribution of the councils in the materialized view:

postgres=# select distinct councils, count(*) from mv_traffic_violations group by councils;
 councils | count  
----------+--------
        1 | 231070
        2 | 211759
        3 | 328234
        4 | 334142
        5 | 308402
          | 114471
(6 rows)

First of all, a lot of rows do not have a council at all. NULL values in hash-partitioned tables always go to the partition with remainder 0:

postgres=# truncate table traffic_violations_p_hash;
TRUNCATE TABLE
postgres=# insert into traffic_violations_p_hash (councils) select null from generate_series(1,100);
INSERT 0 100
postgres=# select count(*) from traffic_violations_p_hash_p1;
 count 
-------
   100
(1 row)

Here is the reason for this behavior: “The calculated hash value for the null value will be zero, therefore, it will fall to the partition having remainder zero.”

This still does not explain why nothing ends up in partitions 3 to 5. Let's do a few tests with our council values:

postgres=# truncate traffic_violations_p_hash;
TRUNCATE TABLE
postgres=# insert into traffic_violations_p_hash (councils) values (1),(2),(3),(4),(5);
INSERT 0 5
postgres=# select councils from traffic_violations_p_hash_p1;
 councils 
----------
        1
        4
        5
(3 rows)

postgres=# select councils from traffic_violations_p_hash_p2;
 councils 
----------
        2
        3
(2 rows)
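
Instead of inserting rows to discover the routing, you can also ask PostgreSQL directly. A hedged sketch using satisfies_hash_partition(), the internal helper (available since PostgreSQL 11) that tests whether a value belongs to the partition defined by a given modulus and remainder; it also confirms the NULL-to-remainder-zero claim from above:

-- for each council value (including null), find the remainder it satisfies
select c.councils, r.remainder
from (values (null::smallint), (1), (2), (3), (4), (5)) as c(councils)
     cross join generate_series(0, 4) as r(remainder)
where satisfies_hash_partition('traffic_violations_p_hash'::regclass,
                               5, r.remainder, c.councils)
order by c.councils;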

The inserts confirm the behavior: councils 1, 4 and 5 go to the first partition, while 2 and 3 go to the second one. The decision to hash partition on the councils column clearly was not a good one: with only five distinct values and five partitions, an even spread would be luck anyway (the chance that five values hash to five different buckets is 5!/5^5, roughly 3.8%). Hash partitioning is great when you have many different values:

postgres=# truncate traffic_violations_p_hash;
TRUNCATE TABLE
postgres=# insert into traffic_violations_p_hash (councils) select * from generate_series(1,10000);
INSERT 0 10000
postgres=# select count(*) from traffic_violations_p_hash_p1;
 count 
-------
  1969
(1 row)

postgres=# select count(*) from traffic_violations_p_hash_p2;
 count 
-------
  2034
(1 row)

postgres=# select count(*) from traffic_violations_p_hash_p3;
 count 
-------
  2058
(1 row)

postgres=# select count(*) from traffic_violations_p_hash_p4;
 count 
-------
  1928
(1 row)

postgres=# select count(*) from traffic_violations_p_hash_p5;
 count 
-------
  2011
(1 row)

This gives a much better distribution of the data. In our data set a better candidate is the seqid column, which has many distinct values:

drop table traffic_violations_p_hash, traffic_violations_p_hash_p1, traffic_violations_p_hash_p2, traffic_violations_p_hash_p3, traffic_violations_p_hash_p4, traffic_violations_p_hash_p5;
create table traffic_violations_p_hash
( seqid text
, date_of_stop date  
, time_of_stop time
, agency text
, subagency text
, description text 
, location text
, latitude numeric
, longitude numeric
, accident text
, belts boolean
, personal_injury boolean 
, property_damage boolean
, fatal boolean
, commercial_license boolean
, hazmat boolean
, commercial_vehicle boolean
, alcohol boolean
, workzone boolean
, state text
, vehicletype text 
, year smallint
, make text
, model text
, color text
, violation_type text 
, charge text
, article text
, contributed_to_accident boolean
, race text
, gender text
, driver_city text
, driver_state text
, dl_state text
, arrest_type text
, geolocation point
, council_districts smallint 
, councils smallint
, communities smallint
, zip_codes smallint
, municipalities smallint
)
partition by hash (seqid);

create table traffic_violations_p_hash_p1
partition of traffic_violations_p_hash
for values with (modulus 5, remainder 0);

create table traffic_violations_p_hash_p2
partition of traffic_violations_p_hash
for values with (modulus 5, remainder 1);

create table traffic_violations_p_hash_p3
partition of traffic_violations_p_hash
for values with (modulus 5, remainder 2);

create table traffic_violations_p_hash_p4
partition of traffic_violations_p_hash
for values with (modulus 5, remainder 3);

create table traffic_violations_p_hash_p5
partition of traffic_violations_p_hash
for values with (modulus 5, remainder 4);

Loading the data again and checking the partitions:

postgres=# insert into traffic_violations_p_hash
postgres-#        select * from mv_traffic_violations;
INSERT 0 1528078
postgres=# select count(*) from traffic_violations_p_hash_p1;
 count  
--------
 305253
(1 row)

postgres=# select count(*) from traffic_violations_p_hash_p2;
 count  
--------
 304999
(1 row)

postgres=# select count(*) from traffic_violations_p_hash_p3;
 count  
--------
 305215
(1 row)

postgres=# select count(*) from traffic_violations_p_hash_p4;
 count  
--------
 305719
(1 row)

postgres=# select count(*) from traffic_violations_p_hash_p5;
 count  
--------
 306892
(1 row)
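
The five count(*) queries can also be collapsed into a single statement against the parent table; a small sketch using the tableoid system column, which identifies the partition each row is physically stored in:

select tableoid::regclass as partition, count(*)
from traffic_violations_p_hash
group by tableoid
order by 1;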

That was hash partitioning. In the next post we’ll look at partition pruning.

The post PostgreSQL partitioning (4): Hash partitioning appeared first on Blog dbi services.

Troubleshooting By Tracing GoldenGate Sessions

VitalSoftTech - Mon, 2019-06-03 09:56
Find out how to troubleshoot by tracing GoldenGate database sessions or the GoldenGate process from GGSCI and the process parameter file.
Categories: DBA Blogs

Partner Webcast – Oracle CASB - Enabling Security Monitoring for Oracle Cloud Infrastructure

As customers adopt cloud-based infrastructure as part of their digital journey, protecting this infrastructure becomes a critical security imperative to ensure that applications that are built on top...

We share our skills to maximize your revenue!
Categories: DBA Blogs

PostgreSQL partitioning (3): List partitioning

Yann Neuhaus - Mon, 2019-06-03 09:12

In the last posts of this series we prepared the data set and had a look at range partitioning. In this post we look at another partitioning strategy: list partitioning. Instead of partitioning by a range (typically based on day, month or year), list partitioning partitions on an explicit list of key values that define the partitions.

Coming back to the materialized view which holds our initial data set:

postgres=# \d mv_traffic_violations 
                 Materialized view "public.mv_traffic_violations"
         Column          |          Type          | Collation | Nullable | Default 
-------------------------+------------------------+-----------+----------+---------
 seqid                   | text                   |           |          | 
 date_of_stop            | date                   |           |          | 
 time_of_stop            | time without time zone |           |          | 
 agency                  | text                   |           |          | 
 subagency               | text                   |           |          | 
 description             | text                   |           |          | 
 location                | text                   |           |          | 
 latitude                | numeric                |           |          | 
 longitude               | numeric                |           |          | 
 accident                | text                   |           |          | 
 belts                   | boolean                |           |          | 
 personal_injury         | boolean                |           |          | 
 property_damage         | boolean                |           |          | 
 fatal                   | boolean                |           |          | 
 commercial_license      | boolean                |           |          | 
 hazmat                  | boolean                |           |          | 
 commercial_vehicle      | boolean                |           |          | 
 alcohol                 | boolean                |           |          | 
 workzone                | boolean                |           |          | 
 state                   | text                   |           |          | 
 vehicletype             | text                   |           |          | 
 year                    | smallint               |           |          | 
 make                    | text                   |           |          | 
 model                   | text                   |           |          | 
 color                   | text                   |           |          | 
 violation_type          | text                   |           |          | 
 charge                  | text                   |           |          | 
 article                 | text                   |           |          | 
 contributed_to_accident | boolean                |           |          | 
 race                    | text                   |           |          | 
 gender                  | text                   |           |          | 
 driver_city             | text                   |           |          | 
 driver_state            | text                   |           |          | 
 dl_state                | text                   |           |          | 
 arrest_type             | text                   |           |          | 
 geolocation             | point                  |           |          | 
 council_districts       | smallint               |           |          | 
 councils                | smallint               |           |          | 
 communities             | smallint               |           |          | 
 zip_codes               | smallint               |           |          | 
 municipalities          | smallint               |           |          | 

The application might want to ask for all traffic violations that match a specific violation type:

postgres=# select distinct violation_type from mv_traffic_violations;
 violation_type 
----------------
 Warning
 SERO
 Citation
 ESERO
(4 rows)
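
A typical query then restricts on one of these values; once the table is list partitioned, such a predicate maps to exactly one partition. A sketch of that query shape against the materialized view:

select count(*)
from mv_traffic_violations
where violation_type = 'Citation';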

Partitioning by that would result in 4 partitions (plus one default partition for any other potential violation type in the future). The definition of the table and its partitions is pretty much the same as for range partitioning, but this time we go for list partitioning:

create table traffic_violations_p_list
( seqid text
, date_of_stop date  
, time_of_stop time
, agency text
, subagency text
, description text 
, location text
, latitude numeric
, longitude numeric
, accident text
, belts boolean
, personal_injury boolean 
, property_damage boolean
, fatal boolean
, commercial_license boolean
, hazmat boolean
, commercial_vehicle boolean
, alcohol boolean
, workzone boolean
, state text
, vehicletype text 
, year smallint
, make text
, model text
, color text
, violation_type text 
, charge text
, article text
, contributed_to_accident boolean
, race text
, gender text
, driver_city text
, driver_state text
, dl_state text
, arrest_type text
, geolocation point
, council_districts smallint 
, councils smallint
, communities smallint
, zip_codes smallint
, municipalities smallint
)
partition by list (violation_type);

psql will show the list partitioning strategy as well:

postgres=# \d traffic_violations_p_list
               Partitioned table "public.traffic_violations_p_list"
         Column          |          Type          | Collation | Nullable | Default 
-------------------------+------------------------+-----------+----------+---------
 seqid                   | text                   |           |          | 
 date_of_stop            | date                   |           |          | 
 time_of_stop            | time without time zone |           |          | 
 agency                  | text                   |           |          | 
 subagency               | text                   |           |          | 
 description             | text                   |           |          | 
 location                | text                   |           |          | 
 latitude                | numeric                |           |          | 
 longitude               | numeric                |           |          | 
 accident                | text                   |           |          | 
 belts                   | boolean                |           |          | 
 personal_injury         | boolean                |           |          | 
 property_damage         | boolean                |           |          | 
 fatal                   | boolean                |           |          | 
 commercial_license      | boolean                |           |          | 
 hazmat                  | boolean                |           |          | 
 commercial_vehicle      | boolean                |           |          | 
 alcohol                 | boolean                |           |          | 
 workzone                | boolean                |           |          | 
 state                   | text                   |           |          | 
 vehicletype             | text                   |           |          | 
 year                    | smallint               |           |          | 
 make                    | text                   |           |          | 
 model                   | text                   |           |          | 
 color                   | text                   |           |          | 
 violation_type          | text                   |           |          | 
 charge                  | text                   |           |          | 
 article                 | text                   |           |          | 
 contributed_to_accident | boolean                |           |          | 
 race                    | text                   |           |          | 
 gender                  | text                   |           |          | 
 driver_city             | text                   |           |          | 
 driver_state            | text                   |           |          | 
 dl_state                | text                   |           |          | 
 arrest_type             | text                   |           |          | 
 geolocation             | point                  |           |          | 
 council_districts       | smallint               |           |          | 
 councils                | smallint               |           |          | 
 communities             | smallint               |           |          | 
 zip_codes               | smallint               |           |          | 
 municipalities          | smallint               |           |          | 
Partition key: LIST (violation_type)
Number of partitions: 0

Now we can create the partitions:

create table traffic_violations_p_list_warning
partition of traffic_violations_p_list
for values in ('Warning');

create table traffic_violations_p_list_sero
partition of traffic_violations_p_list
for values in ('SERO');

create table traffic_violations_p_list_Citation
partition of traffic_violations_p_list
for values in ('Citation');

create table traffic_violations_p_list_ESERO
partition of traffic_violations_p_list
for values in ('ESERO');

create table traffic_violations_p_list_default
    partition of traffic_violations_p_list DEFAULT;

Again, psql will display the partitions nicely:

postgres=# \d+ traffic_violations_p_list
                                   Partitioned table "public.traffic_violations_p_list"
         Column          |          Type          | Collation | Nullable | Default | Storage  | Stats target | Description 
-------------------------+------------------------+-----------+----------+---------+----------+--------------+-------------
 seqid                   | text                   |           |          |         | extended |              | 
 date_of_stop            | date                   |           |          |         | plain    |              | 
 time_of_stop            | time without time zone |           |          |         | plain    |              | 
 agency                  | text                   |           |          |         | extended |              | 
 subagency               | text                   |           |          |         | extended |              | 
 description             | text                   |           |          |         | extended |              | 
 location                | text                   |           |          |         | extended |              | 
 latitude                | numeric                |           |          |         | main     |              | 
 longitude               | numeric                |           |          |         | main     |              | 
 accident                | text                   |           |          |         | extended |              | 
 belts                   | boolean                |           |          |         | plain    |              | 
 personal_injury         | boolean                |           |          |         | plain    |              | 
 property_damage         | boolean                |           |          |         | plain    |              | 
 fatal                   | boolean                |           |          |         | plain    |              | 
 commercial_license      | boolean                |           |          |         | plain    |              | 
 hazmat                  | boolean                |           |          |         | plain    |              | 
 commercial_vehicle      | boolean                |           |          |         | plain    |              | 
 alcohol                 | boolean                |           |          |         | plain    |              | 
 workzone                | boolean                |           |          |         | plain    |              | 
 state                   | text                   |           |          |         | extended |              | 
 vehicletype             | text                   |           |          |         | extended |              | 
 year                    | smallint               |           |          |         | plain    |              | 
 make                    | text                   |           |          |         | extended |              | 
 model                   | text                   |           |          |         | extended |              | 
 color                   | text                   |           |          |         | extended |              | 
 violation_type          | text                   |           |          |         | extended |              | 
 charge                  | text                   |           |          |         | extended |              | 
 article                 | text                   |           |          |         | extended |              | 
 contributed_to_accident | boolean                |           |          |         | plain    |              | 
 race                    | text                   |           |          |         | extended |              | 
 gender                  | text                   |           |          |         | extended |              | 
 driver_city             | text                   |           |          |         | extended |              | 
 driver_state            | text                   |           |          |         | extended |              | 
 dl_state                | text                   |           |          |         | extended |              | 
 arrest_type             | text                   |           |          |         | extended |              | 
 geolocation             | point                  |           |          |         | plain    |              | 
 council_districts       | smallint               |           |          |         | plain    |              | 
 councils                | smallint               |           |          |         | plain    |              | 
 communities             | smallint               |           |          |         | plain    |              | 
 zip_codes               | smallint               |           |          |         | plain    |              | 
 municipalities          | smallint               |           |          |         | plain    |              | 
Partition key: LIST (violation_type)
Partitions: traffic_violations_p_list_citation FOR VALUES IN ('Citation'),
            traffic_violations_p_list_esero FOR VALUES IN ('ESERO'),
            traffic_violations_p_list_sero FOR VALUES IN ('SERO'),
            traffic_violations_p_list_warning FOR VALUES IN ('Warning'),
            traffic_violations_p_list_default DEFAULT

Loading the data and checking the amount of data for each partition:

postgres=# insert into traffic_violations_p_list
postgres-#        select * from mv_traffic_violations;
INSERT 0 1528078
postgres=# select count(*) from traffic_violations_p_list_warning;
 count  
--------
 759747
(1 row)

postgres=# select count(*) from traffic_violations_p_list_sero;
 count 
-------
   899
(1 row)

postgres=# select count(*) from traffic_violations_p_list_Citation;
 count  
--------
 693226
(1 row)

postgres=# select count(*) from traffic_violations_p_list_ESERO;
 count 
-------
 74206
(1 row)

postgres=# select count(*) from traffic_violations_p_list_default;
 count 
-------
     0
(1 row)
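
The default partition is empty because every violation type in the data set has its own partition; it only receives rows with values outside the defined lists. A quick, hedged check, wrapped in a transaction and rolled back so the data set stays untouched ('Pilot' is a made-up violation type):

begin;
insert into traffic_violations_p_list (violation_type)
values ('Pilot');
-- inside the transaction this returns 1: the row was routed to the default partition
select count(*) from traffic_violations_p_list_default;
rollback;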

Done. This is list partitioning. Instead of partitioning by a range, list partitioning is for partitioning by key values. In the next post we will have a look at hash partitioning.

The post PostgreSQL partitioning (3): List partitioning appeared first on Blog dbi services.

So long …

Jonathan Lewis - Mon, 2019-06-03 08:09

… and thanks for any Douglas Adams references.

I decided on Friday that after more than 31 years using the Oracle software and more than 37 years of being self-employed in the computer industry – not to mention a further 8 years being an enthusiastic amateur – the time had finally come for a change of pace.  As from 1st June 2019 I am retired.

I have a couple of long-term commitments that I'll still be doing a little work for over the next few months, and I wouldn't be surprised if someone occasionally tempts me into doing a few days solving interesting problems. I'll still be answering questions on the Oracle Developer Community forum and (if the email gets through) on the Oracle-L listserver, and I'll still be writing blog notes about Oracle and submitting abstracts to conferences for a few years. Who knows, I may even get round to writing another book – after all, I've already downloaded Oracle 19 so I'll still be experimenting, investigating, and trying to break things. Some people retire to do the Times crossword or Sudoku; my retirement pastime is going to be playing with Oracle.

But now I’ve got 3 musical instruments to re-learn, a garden to re-engineer, apple trees to train, and a lot of rowing [U.S. “crew”] time to catch up on so that I can get fit enough to get back to playing squash (which might be an interesting experience now that my eyesight has been fixed). And for relaxation I’ll be sitting in the garden watching the birds, drinking fine wine and reading a good book.

Update (6th June 2019)

Thanks for all the appreciative comments. It's nice to have a record of the fact that I have made a difference for so many people.

 

Ignoring Hints

Jonathan Lewis - Mon, 2019-06-03 02:36

One of the small changes (and a potentially big, but temporary, threat) in 18.3 is the status of the “ignore hints” parameter. It ceases to be a hidden (underscore) parameter, so you can now officially set the parameter optimizer_ignore_hints to true in the parameter file, at the system level, or at the session level. The threat, of course, is that some of your code may use the hidden version of the parameter (perhaps in a SQL Patch as an opt_param() option rather than in its hint form), which no longer works after the upgrade.
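
If you want to check whether your code is exposed, the statement-level pattern to search for looks like the following hypothetical sketch (it reuses the t1 demo table created below; whether a particular hidden parameter is accepted by opt_param() varies by version):

select
        /*+ opt_param('_optimizer_ignore_hints' 'true') index(t1) */
        n1, sum(id)
from
        t1
where
        id > 0
group by
        n1
;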

But there’s more. The parameter (whether the old hidden version or the new revealed version) doesn’t make the optimizer ignore parallel() hints. But 18.3 now has a related parameter optimizer_ignore_parallel_hints to address this limitation. Here’s a quick demo – we start by creating a table and then running a query where the full tablescan is clearly the default strategy that the optimizer would take if we didn’t hint an indexed access path:

rem
rem     Script:         ignore_parallel_hints.sql
rem     Author:         Jonathan Lewis
rem     Dated:          May 2019
rem 

create table t1
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                          id,
        mod(rownum,10)                  n1,
        lpad(rownum,10,'0')             v1,
        lpad('x',100,'x')               padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e4 -- > comment to avoid WordPress format issue
;

create index t1_i1 on t1(id);

set serveroutput off

prompt  =============
prompt  Baseline test
prompt  =============

select
        /*+ index(t1) */
        n1, sum(id)
from
        t1
where
        id > 0
group by
        n1
order by
        n1
;

select * from table(dbms_xplan.display_cursor);


SQL_ID  gudnnk7j7q5bz, child number 0
-------------------------------------
select  /*+ index(t1) */  n1, sum(id) from  t1 where  id > 0 group by
n1 order by  n1

Plan hash value: 356059923

----------------------------------------------------------------------------------------------
| Id  | Operation                            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |       |       |       |   198 (100)|          |
|   1 |  SORT GROUP BY                       |       |    10 |    70 |   198   (2)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T1    | 10000 | 70000 |   196   (1)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | T1_I1 | 10000 |       |    22   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("ID">0)


Now we repeat the exercise with the version-specific “alter session” commands below – try each option with each version of Oracle if you want to do the complete test cycle – to see that the session will ignore hints and the plan will change (side note – using the underscore version with 18.3 doesn't raise an error, the statement is silently ignored):


alter session set "_optimizer_ignore_hints" = true;
alter session set "optimizer_ignore_hints" = true;

SQL_ID  gudnnk7j7q5bz, child number 1
-------------------------------------
select  /*+ index(t1) */  n1, sum(id) from  t1 where  id > 0 group by
n1 order by  n1

Plan hash value: 3946799371

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |    27 (100)|          |
|   1 |  SORT GROUP BY     |      |    10 |    70 |    27  (12)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| T1   | 10000 | 70000 |    25   (4)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("ID">0)

Then, of course, we have to test a query with a parallel() hint – or shared() hint, which is the internal equivalent you will notice occasionally in outlines or the “remote” statement for distributed execution plans – to show that this parameter doesn't make the optimizer ignore parallel hints – the plans following the code are from 18.3:


select
        /*+ parallel(t1 3) */
        n1, sum(id)
from
        t1
where
        id > 0
group by
        n1
order by
        n1
;

select * from table(dbms_xplan.display_cursor);

alter session set "optimizer_ignore_parallel_hints" = true;

select
        /*+ parallel(t1 3) */
        n1, sum(id)
from
        t1
where
        id > 0
group by
        n1
order by
        n1
;

select * from table(dbms_xplan.display_cursor);




SQL_ID  7jynurdtc48kv, child number 0
-------------------------------------
select  /*+ parallel(t1 3) */  n1, sum(id) from  t1 where  id > 0 group
by  n1 order by  n1

Plan hash value: 2919148568

------------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |          |       |       |    10 (100)|          |        |      |            |
|   1 |  PX COORDINATOR          |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (ORDER)     | :TQ10001 |    10 |    70 |    10  (10)| 00:00:01 |  Q1,01 | P->S | QC (ORDER) |
|   3 |    SORT GROUP BY         |          |    10 |    70 |    10  (10)| 00:00:01 |  Q1,01 | PCWP |            |
|   4 |     PX RECEIVE           |          |    10 |    70 |    10  (10)| 00:00:01 |  Q1,01 | PCWP |            |
|   5 |      PX SEND RANGE       | :TQ10000 |    10 |    70 |    10  (10)| 00:00:01 |  Q1,00 | P->P | RANGE      |
|   6 |       HASH GROUP BY      |          |    10 |    70 |    10  (10)| 00:00:01 |  Q1,00 | PCWP |            |
|   7 |        PX BLOCK ITERATOR |          | 10000 | 70000 |     9   (0)| 00:00:01 |  Q1,00 | PCWC |            |
|*  8 |         TABLE ACCESS FULL| T1       | 10000 | 70000 |     9   (0)| 00:00:01 |  Q1,00 | PCWP |            |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   8 - access(:Z>=:Z AND :Z<=:Z) -- > comment added to avoid wordpress format issue
       filter("ID">0)

Note
-----
   - Degree of Parallelism is 3 because of table property


Session altered.


SQL_ID  7jynurdtc48kv, child number 1
-------------------------------------
select  /*+ parallel(t1 3) */  n1, sum(id) from  t1 where  id > 0 group
by  n1 order by  n1

Plan hash value: 3946799371

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |    27 (100)|          |
|   1 |  SORT GROUP BY     |      |    10 |    70 |    27  (12)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| T1   | 10000 | 70000 |    25   (4)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("ID">0)


tl;dr

In 18.3 Oracle exposes the parameter optimizer_ignore_hints – any code depending on the hidden version of this parameter will no longer behave as expected. 18.3 also introduces optimizer_ignore_parallel_hints to allow you to ignore parallel hints as well.

 

PostgreSQL partitioning (2): Range partitioning

Yann Neuhaus - Mon, 2019-06-03 01:00

Now that the data set is ready we will look at the first partitioning strategy: range partitioning. Range partitioning is usually used to partition a table by days, months or years, although you can partition by other data types as well. Time/date-based range partitioning is probably the most common use case, so we will partition our traffic violation data by year and later on sub-partition that by month.

Going back to the materialized view we created in the first post, this is the structure:

postgres=# \d mv_traffic_violations 
                 Materialized view "public.mv_traffic_violations"
         Column          |          Type          | Collation | Nullable | Default 
-------------------------+------------------------+-----------+----------+---------
 seqid                   | text                   |           |          | 
 date_of_stop            | date                   |           |          | 
 time_of_stop            | time without time zone |           |          | 
 agency                  | text                   |           |          | 
 subagency               | text                   |           |          | 
 description             | text                   |           |          | 
 location                | text                   |           |          | 
 latitude                | numeric                |           |          | 
 longitude               | numeric                |           |          | 
 accident                | text                   |           |          | 
 belts                   | boolean                |           |          | 
 personal_injury         | boolean                |           |          | 
 property_damage         | boolean                |           |          | 
 fatal                   | boolean                |           |          | 
 commercial_license      | boolean                |           |          | 
 hazmat                  | boolean                |           |          | 
 commercial_vehicle      | boolean                |           |          | 
 alcohol                 | boolean                |           |          | 
 workzone                | boolean                |           |          | 
 state                   | text                   |           |          | 
 vehicletype             | text                   |           |          | 
 year                    | smallint               |           |          | 
 make                    | text                   |           |          | 
 model                   | text                   |           |          | 
 color                   | text                   |           |          | 
 violation_type          | text                   |           |          | 
 charge                  | text                   |           |          | 
 article                 | text                   |           |          | 
 contributed_to_accident | boolean                |           |          | 
 race                    | text                   |           |          | 
 gender                  | text                   |           |          | 
 driver_city             | text                   |           |          | 
 driver_state            | text                   |           |          | 
 dl_state                | text                   |           |          | 
 arrest_type             | text                   |           |          | 
 geolocation             | point                  |           |          | 
 council_districts       | smallint               |           |          | 
 councils                | smallint               |           |          | 
 communities             | smallint               |           |          | 
 zip_codes               | smallint               |           |          | 
 municipalities          | smallint               |           |          | 

Let's assume our most common queries against the data set restrict the data to one or more years. Using partitioned tables we can range partition on the “date_of_stop” column and break the large data set down into smaller chunks per year. How would that work in PostgreSQL?

Currently we have this min and max values for the “date_of_stop” column:

postgres=# select min(date_of_stop), max(date_of_stop) from mv_traffic_violations;
    min     |    max     
------------+------------
 2012-01-01 | 2019-06-01
(1 row)
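
To make that concrete, the queries we have in mind have the following shape (a sketch against the materialized view; note the half-open interval, which matches how the partition bounds will be defined below):

select count(*)
from mv_traffic_violations
where date_of_stop >= date '2018-01-01'
  and date_of_stop < date '2019-01-01';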

If we want to partition that by year we need at least 8 partitions (2012 to 2019), and we might already add a partition for 2020 so that we are on the safe side when a more recent data set brings data for the next year. The first step is to create a partitioned table:

create table traffic_violations_p
( seqid text
, date_of_stop date not null
, time_of_stop time
, agency text
, subagency text
, description text 
, location text
, latitude numeric
, longitude numeric
, accident text
, belts boolean
, personal_injury boolean 
, property_damage boolean
, fatal boolean
, commercial_license boolean
, hazmat boolean
, commercial_vehicle boolean
, alcohol boolean
, workzone boolean
, state text
, vehicletype text 
, year smallint
, make text
, model text
, color text
, violation_type text 
, charge text
, article text
, contributed_to_accident boolean
, race text
, gender text
, driver_city text
, driver_state text
, dl_state text
, arrest_type text
, geolocation point
, council_districts smallint 
, councils smallint
, communities smallint
, zip_codes smallint
, municipalities smallint)
partition by range (date_of_stop);

Now we have a partitioned table with no partitions:

postgres=# \d traffic_violations_p
                  Partitioned table "public.traffic_violations_p"
         Column          |          Type          | Collation | Nullable | Default 
-------------------------+------------------------+-----------+----------+---------
 seqid                   | text                   |           |          | 
 date_of_stop            | date                   |           | not null | 
 time_of_stop            | time without time zone |           |          | 
 agency                  | text                   |           |          | 
 subagency               | text                   |           |          | 
 description             | text                   |           |          | 
 location                | text                   |           |          | 
 latitude                | numeric                |           |          | 
 longitude               | numeric                |           |          | 
 accident                | text                   |           |          | 
 belts                   | boolean                |           |          | 
 personal_injury         | boolean                |           |          | 
 property_damage         | boolean                |           |          | 
 fatal                   | boolean                |           |          | 
 commercial_license      | boolean                |           |          | 
 hazmat                  | boolean                |           |          | 
 commercial_vehicle      | boolean                |           |          | 
 alcohol                 | boolean                |           |          | 
 workzone                | boolean                |           |          | 
 state                   | text                   |           |          | 
 vehicletype             | text                   |           |          | 
 year                    | smallint               |           |          | 
 make                    | text                   |           |          | 
 model                   | text                   |           |          | 
 color                   | text                   |           |          | 
 violation_type          | text                   |           |          | 
 charge                  | text                   |           |          | 
 article                 | text                   |           |          | 
 contributed_to_accident | boolean                |           |          | 
 race                    | text                   |           |          | 
 gender                  | text                   |           |          | 
 driver_city             | text                   |           |          | 
 driver_state            | text                   |           |          | 
 dl_state                | text                   |           |          | 
 arrest_type             | text                   |           |          | 
 geolocation             | point                  |           |          | 
 council_districts       | smallint               |           |          | 
 councils                | smallint               |           |          | 
 communities             | smallint               |           |          | 
 zip_codes               | smallint               |           |          | 
 municipalities          | smallint               |           |          | 
Partition key: RANGE (date_of_stop)
Number of partitions: 0

If we try to insert data without any partitions this will obviously fail:

postgres=# insert into traffic_violations_p (date_of_Stop) values ( now() );
psql: ERROR:  no partition of relation "traffic_violations_p" found for row
DETAIL:  Partition key of the failing row contains (date_of_stop) = (2019-05-28).
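
One way to avoid this error entirely would be a default partition, which catches every row that does not fall into any defined range; a minimal sketch only, as we stick to explicit yearly partitions in this post:

create table traffic_violations_p_def
partition of traffic_violations_p default;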

We need partitions, and this is how you can add them to the partitioned table. Note that the upper bound of a range partition is exclusive, so each yearly partition runs up to (but not including) the first day of the following year:

create table traffic_violations_p_2012
partition of traffic_violations_p
for values from ('2012-01-01') to ('2012-12-31');

create table traffic_violations_p_2013
partition of traffic_violations_p
for values from ('2013-01-01') to ('2013-12-31');

create table traffic_violations_p_2014
partition of traffic_violations_p
for values from ('2014-01-01') to ('2014-12-31');

create table traffic_violations_p_2015
partition of traffic_violations_p
for values from ('2015-01-01') to ('2015-12-31');

create table traffic_violations_p_2016
partition of traffic_violations_p
for values from ('2016-01-01') to ('2016-12-31');

create table traffic_violations_p_2017
partition of traffic_violations_p
for values from ('2017-01-01') to ('2017-12-31');

create table traffic_violations_p_2018
partition of traffic_violations_p
for values from ('2018-01-01') to ('2018-12-31');

create table traffic_violations_p_2019
partition of traffic_violations_p
for values from ('2019-01-01') to ('2019-12-31');

create table traffic_violations_p_2020
partition of traffic_violations_p
for values from ('2020-01-01') to ('2020-12-31');

We now have a partition for each year from 2012 to 2020:

postgres=# \d+ traffic_violations_p
                                      Partitioned table "public.traffic_violations_p"
         Column          |          Type          | Collation | Nullable | Default | Storage  | Stats target | Description 
-------------------------+------------------------+-----------+----------+---------+----------+--------------+-------------
 seqid                   | text                   |           |          |         | extended |              | 
 date_of_stop            | date                   |           |          |         | plain    |              | 
 time_of_stop            | time without time zone |           |          |         | plain    |              | 
 agency                  | text                   |           |          |         | extended |              | 
 subagency               | text                   |           |          |         | extended |              | 
 description             | text                   |           |          |         | extended |              | 
 location                | text                   |           |          |         | extended |              | 
 latitude                | numeric                |           |          |         | main     |              | 
 longitude               | numeric                |           |          |         | main     |              | 
 accident                | text                   |           |          |         | extended |              | 
 belts                   | boolean                |           |          |         | plain    |              | 
 personal_injury         | boolean                |           |          |         | plain    |              | 
 property_damage         | boolean                |           |          |         | plain    |              | 
 fatal                   | boolean                |           |          |         | plain    |              | 
 commercial_license      | boolean                |           |          |         | plain    |              | 
 hazmat                  | boolean                |           |          |         | plain    |              | 
 commercial_vehicle      | boolean                |           |          |         | plain    |              | 
 alcohol                 | boolean                |           |          |         | plain    |              | 
 workzone                | boolean                |           |          |         | plain    |              | 
 state                   | text                   |           |          |         | extended |              | 
 vehicletype             | text                   |           |          |         | extended |              | 
 year                    | smallint               |           |          |         | plain    |              | 
 make                    | text                   |           |          |         | extended |              | 
 model                   | text                   |           |          |         | extended |              | 
 color                   | text                   |           |          |         | extended |              | 
 violation_type          | text                   |           |          |         | extended |              | 
 charge                  | text                   |           |          |         | extended |              | 
 article                 | text                   |           |          |         | extended |              | 
 contributed_to_accident | boolean                |           |          |         | plain    |              | 
 race                    | text                   |           |          |         | extended |              | 
 gender                  | text                   |           |          |         | extended |              | 
 driver_city             | text                   |           |          |         | extended |              | 
 driver_state            | text                   |           |          |         | extended |              | 
 dl_state                | text                   |           |          |         | extended |              | 
 arrest_type             | text                   |           |          |         | extended |              | 
 geolocation             | point                  |           |          |         | plain    |              | 
 council_districts       | smallint               |           |          |         | plain    |              | 
 councils                | smallint               |           |          |         | plain    |              | 
 communities             | smallint               |           |          |         | plain    |              | 
 zip_codes               | smallint               |           |          |         | plain    |              | 
 municipalities          | smallint               |           |          |         | plain    |              | 
Partition key: RANGE (date_of_stop)
Partitions: traffic_violations_p_2012 FOR VALUES FROM ('2012-01-01') TO ('2012-12-31'),
            traffic_violations_p_2013 FOR VALUES FROM ('2013-01-01') TO ('2013-12-31'),
            traffic_violations_p_2014 FOR VALUES FROM ('2014-01-01') TO ('2014-12-31'),
            traffic_violations_p_2015 FOR VALUES FROM ('2015-01-01') TO ('2015-12-31'),
            traffic_violations_p_2016 FOR VALUES FROM ('2016-01-01') TO ('2016-12-31'),
            traffic_violations_p_2017 FOR VALUES FROM ('2017-01-01') TO ('2017-12-31'),
            traffic_violations_p_2018 FOR VALUES FROM ('2018-01-01') TO ('2018-12-31'),
            traffic_violations_p_2019 FOR VALUES FROM ('2019-01-01') TO ('2019-12-31'),
            traffic_violations_p_2020 FOR VALUES FROM ('2020-01-01') TO ('2020-12-31')

Doing the same insert as before now succeeds because there is a partition the row can go into:

postgres=# insert into traffic_violations_p (date_of_Stop) values ( now() );
INSERT 0 1
postgres=# delete from traffic_violations_p;
DELETE 1
postgres=# 

An insert with a date before 2012 or after 2020 would still fail, though:

postgres=# insert into traffic_violations_p (date_of_Stop) values ( now() + interval '2 years' );
psql: ERROR:  no partition of relation "traffic_violations_p" found for row
DETAIL:  Partition key of the failing row contains (date_of_stop) = (2021-05-28).

To avoid that we can create a “default” partition:

CREATE TABLE traffic_violations_p_default
    PARTITION OF traffic_violations_p DEFAULT;

A default partition will hold all the rows that do not match any of the existing partition definitions:

postgres=# select (date_of_stop) from traffic_violations_p_default;
 date_of_stop 
--------------
 2021-05-28
(1 row)
postgres=# delete from traffic_violations_p;
DELETE 1

As our partitioned table setup is now complete we can load the data:

postgres=# insert into traffic_violations_p select * from mv_traffic_violations;
INSERT 0 1528078

All rows successfully loaded so we can check the counts for each partition:

postgres=# select count(*) from traffic_violations_p_2012;
 count  
--------
 150989
(1 row)

postgres=# select count(*) from traffic_violations_p_2013;
 count  
--------
 189946
(1 row)

postgres=# select count(*) from traffic_violations_p_2014;
 count  
--------
 223248
(1 row)

postgres=# select count(*) from traffic_violations_p_2015;
 count  
--------
 234468
(1 row)

postgres=# select count(*) from traffic_violations_p_2016;
 count  
--------
 219227
(1 row)

postgres=# select count(*) from traffic_violations_p_2017;
 count  
--------
 198879
(1 row)

postgres=# select count(*) from traffic_violations_p_2018;
 count  
--------
 217911
(1 row)

postgres=# select count(*) from traffic_violations_p_2019;
 count 
-------
 89823
(1 row)

postgres=# select count(*) from traffic_violations_p_2020;
 count 
-------
     0
(1 row)

postgres=# select count(*) from traffic_violations_p_default;
 count 
-------
  3588
(1 row)

Why do we have so many rows in the default partition? We checked the min and max values and there should be no rows before 2012 or after 2019. But:

postgres=# select distinct date_of_stop from traffic_violations_p_default order by 1;
 date_of_stop 
--------------
 2012-12-31
 2013-12-31
 2014-12-31
 2015-12-31
 2016-12-31
 2017-12-31
 2018-12-31
(7 rows)

The issue is that the upper bound is exclusive, so we did not set up the partitions correctly. The correct way of doing it is:

drop table traffic_violations_p_2012, traffic_violations_p_2013, traffic_violations_p_2014, traffic_violations_p_2015, traffic_violations_p_2016, traffic_violations_p_2017, traffic_violations_p_2018, traffic_violations_p_2019, traffic_violations_p_2020;

create table traffic_violations_p_2012
partition of traffic_violations_p
for values from ('2012-01-01') to ('2013-01-01');

create table traffic_violations_p_2013
partition of traffic_violations_p
for values from ('2013-01-01') to ('2014-01-01');

create table traffic_violations_p_2014
partition of traffic_violations_p
for values from ('2014-01-01') to ('2015-01-01');

create table traffic_violations_p_2015
partition of traffic_violations_p
for values from ('2015-01-01') to ('2016-01-01');

create table traffic_violations_p_2016
partition of traffic_violations_p
for values from ('2016-01-01') to ('2017-01-01');

create table traffic_violations_p_2017
partition of traffic_violations_p
for values from ('2017-01-01') to ('2018-01-01');

create table traffic_violations_p_2018
partition of traffic_violations_p
for values from ('2018-01-01') to ('2019-01-01');

create table traffic_violations_p_2019
partition of traffic_violations_p
for values from ('2019-01-01') to ('2020-01-01');

create table traffic_violations_p_2020
partition of traffic_violations_p
for values from ('2020-01-01') to ('2021-01-01');

delete from traffic_violations_p;

Once the partitions are set up as required the data is correctly stored:

postgres=# insert into traffic_violations_p select * from mv_traffic_violations;
INSERT 0 1528078
postgres=# select count(*) from traffic_violations_p_default ;
 count 
-------
     0
(1 row)
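
If you want to see the distribution over all partitions with a single statement instead of counting each partition separately, you can group by tableoid; this is plain PostgreSQL and nothing specific to this setup:

select tableoid::regclass as partition
     , count(*)
  from traffic_violations_p
 group by 1
 order by 1;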

Instead of having one big table containing all the data we now have 10 partitions splitting the table into smaller chunks. Why does that matter? That will be the topic of another post, after we have had a look at list and hash partitioning in the next posts.

The article PostgreSQL partitioning (2): Range partitioning appeared first on Blog dbi services.

PostgreSQL partitioning (1): Preparing the data set

Yann Neuhaus - Sun, 2019-06-02 08:20

This is the start of a series about partitioning in PostgreSQL. For this and the following posts I will use PostgreSQL 12 (which currently is in beta) so some stuff might not work if you are on PostgreSQL 11, or even on PostgreSQL 10 when declarative partitioning was introduced. This very first post is only about preparing the data set and I decided to separate it from the next post because it shows various things around loading data in PostgreSQL and therefore might be useful on its own.

When you are looking for free data sets, “The home of the U.S. Government’s open data” is a great source. It provides free data sets for research, development or just data you can play with for whatever reason. For the scope of this and the following posts we will use the Traffic violations data set. It contains more than 1.5 million rows and is a good starting point for diving into PostgreSQL partitioning.

The initial import goes into a table which only contains text columns and from there we can load our partitions later on. The structure of the table is:

create table traffic_violations
( seqid text
, date_of_stop text
, time_of_stop text
, agency text
, subagency text
, description text
, location text
, latitude text
, longitude text 
, accident text
, belts text
, personal_injury text
, property_damage text
, fatal text
, commercial_license text
, hazmat text
, commercial_vehicle text
, alcohol text
, workzone text
, state text
, vehicletype text
, year text
, make text
, model text
, color text
, violation_type text
, charge text
, article text
, contributed_to_accident text
, race text
, gender text
, driver_city text
, driver_state text
, dl_state text
, arrest_type text
, geolocation text
, council_districts text
, councils text
, communities text
, zip_codes text
, municipalities text
);

For loading the table the easiest way is to use PostgreSQL’s copy command:

postgres=# \! ls -l Traffic_Violations.csv
-rw-rw-r--. 1 postgres postgres 634161687 May 28 11:28 Traffic_Violations.csv
postgres=# copy traffic_violations from '/home/postgres/Traffic_Violations.csv' with ( format csv, header true, null 'null', delimiter ',');
COPY 1528078

So we have around 1.5 million rows of real data. As this table contains only text columns we will create a materialized view with data types that fit the data. Looking at one row of the table:

postgres=# \x
Expanded display is on.
postgres=# select * from traffic_violations limit 1;
-[ RECORD 1 ]-----------+-------------------------------------------------------------------------
seqid                   | e13d2082-55a7-4a93-8836-173be19d2648
date_of_stop            | 06/01/2019
time_of_stop            | 13:38:00
agency                  | MCP
subagency               | 2nd District, Bethesda
description             | DRIVER FAILURE TO YIELD RIGHT-OF-WAY TO VEH. UPON EMERGING FROM DRIVEWAY
location                | RANDOLPH RD / NEW HAMPSHIRE
latitude                | 39.07592
longitude               | -77.0011316666667
accident                | Yes
belts                   | No
personal_injury         | No
property_damage         | Yes
fatal                   | No
commercial_license      | No
hazmat                  | No
commercial_vehicle      | No
alcohol                 | No
workzone                | No
state                   | MD
vehicletype             | 02 - Automobile
year                    | 2014
make                    | HYUN
model                   | TK
color                   | SILVER
violation_type          | Warning
charge                  | 21-705(c)
article                 | Transportation Article
contributed_to_accident | false
race                    | WHITE
gender                  | F
driver_city             | SILVER SPRING
driver_state            | MD
dl_state                | MD
arrest_type             | A - Marked Patrol
geolocation             | (39.07592, -77.0011316666667)
council_districts       | 5
councils                | 5
communities             | 26
zip_codes               | 12
municipalities          | 1

The materialized view becomes:

create materialized view mv_traffic_violations
( seqid 
, date_of_stop  
, time_of_stop 
, agency  
, subagency  
, description 
, location 
, latitude 
, longitude 
, accident 
, belts 
, personal_injury 
, property_damage 
, fatal 
, commercial_license 
, hazmat 
, commercial_vehicle 
, alcohol 
, workzone 
, state 
, vehicletype 
, year 
, make 
, model 
, color 
, violation_type 
, charge 
, article 
, contributed_to_accident 
, race 
, gender 
, driver_city 
, driver_state 
, dl_state 
, arrest_type 
, geolocation 
, council_districts 
, councils 
, communities 
, zip_codes 
, municipalities 
)
as
select seqid
     , to_date(date_of_stop,'MM/DD/YYYY')
     , time_of_stop::time
     , agency
     , subagency 
     , description
     , location
     , latitude::numeric
     , longitude::numeric
     , accident
     , belts::boolean
     , personal_injury::boolean
     , property_damage::boolean
     , fatal::boolean
     , commercial_license::boolean
     , hazmat::boolean
     , commercial_vehicle::boolean
     , alcohol::boolean
     , workzone::boolean
     , state
     , vehicletype
     , case year
         when '' then null
         else year::smallint
       end
     , make
     , model
     , color
     , violation_type
     , charge
     , article
     , contributed_to_accident::boolean
     , race
     , gender
     , driver_city
     , driver_state
     , dl_state
     , arrest_type
     , geolocation::point
     , case council_districts
         when '' then null
         else council_districts::smallint
       end
     , case councils
         when '' then null
         else councils::smallint
       end
     , case communities
         when '' then null
         else communities::smallint
       end
     , case zip_codes
         when '' then null
         else zip_codes::smallint
       end
     , case municipalities
         when '' then null
         else municipalities::smallint
       end
  from traffic_violations;

The statement contains some “case” expressions to test for empty strings. If you do not do this, you get errors like the following when you cast into a specific data type:

psql: ERROR:  invalid input syntax for type smallint: ""
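
As a side note, the same empty-string handling can be written more compactly with nullif, which returns null when both arguments are equal. A small sketch of the equivalent expression for one of the columns:

select nullif(year,'')::smallint as year
  from traffic_violations;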

The beauty of a materialized view is that you can refresh it whenever the underlying data set changes, e.g.:

postgres=# refresh materialized view mv_traffic_violations WITH data;
REFRESH MATERIALIZED VIEW

Note that this will block access to the materialized view for the duration of the refresh. You could avoid that by refreshing concurrently, but that only works when you have a unique index on the materialized view. With this data set we cannot create a unique index because some of the rows are duplicates, e.g.:

postgres=# select * from mv_traffic_violations where seqid='b87c908c-ce2d-4c10-89fa-ca48735af485' and date_of_stop = '2012-11-07' and time_of_stop = '05:07:00' and description = 'FAILURE OF VEH. DRIVER IN ACCIDENT TO LOCATE AND NOTIFY OWNER OF UNATTENDED VEH. OF DAMAGE' and charge = '20-105(b)' and violation_type = 'Citation';
-[ RECORD 1 ]-----------+-------------------------------------------------------------------------------------------
seqid                   | b87c908c-ce2d-4c10-89fa-ca48735af485
date_of_stop            | 2012-11-07
time_of_stop            | 05:07:00
agency                  | MCP
subagency               | 2nd District, Bethesda
description             | FAILURE OF VEH. DRIVER IN ACCIDENT TO LOCATE AND NOTIFY OWNER OF UNATTENDED VEH. OF DAMAGE
location                | IFO 2531 ROSS RD
latitude                | 39.0222578333333
longitude               | -77.04575825
accident                | Yes
belts                   | f
personal_injury         | f
property_damage         | t
fatal                   | f
commercial_license      | f
hazmat                  | f
commercial_vehicle      | f
alcohol                 | f
workzone                | f
state                   | MD
vehicletype             | 02 - Automobile
year                    | 2002
make                    | SUBARU
model                   | FORESTER
color                   | SILVER
violation_type          | Citation
charge                  | 20-105(b)
article                 | Transportation Article
contributed_to_accident | f
race                    | HISPANIC
gender                  | M
driver_city             | SILVER SPRING
driver_state            | MD
dl_state                | MD
arrest_type             | A - Marked Patrol
geolocation             | (39.022257833333,-77.04575825)
council_districts       | 5
councils                | 5
communities             | 10
zip_codes               | 17
municipalities          | 1
-[ RECORD 2 ]-----------+-------------------------------------------------------------------------------------------
seqid                   | b87c908c-ce2d-4c10-89fa-ca48735af485
date_of_stop            | 2012-11-07
time_of_stop            | 05:07:00
agency                  | MCP
subagency               | 2nd District, Bethesda
description             | FAILURE OF VEH. DRIVER IN ACCIDENT TO LOCATE AND NOTIFY OWNER OF UNATTENDED VEH. OF DAMAGE
location                | IFO 2531 ROSS RD
latitude                | 39.0222578333333
longitude               | -77.04575825
accident                | Yes
belts                   | f
personal_injury         | f
property_damage         | t
fatal                   | f
commercial_license      | f
hazmat                  | f
commercial_vehicle      | f
alcohol                 | f
workzone                | f
state                   | MD
vehicletype             | 02 - Automobile
year                    | 2002
make                    | SUBARU
model                   | FORESTER
color                   | SILVER
violation_type          | Citation
charge                  | 20-105(b)
article                 | Transportation Article
contributed_to_accident | f
race                    | HISPANIC
gender                  | M
driver_city             | SILVER SPRING
driver_state            | MD
dl_state                | MD
arrest_type             | A - Marked Patrol
geolocation             | (39.022257833333,-77.04575825)
council_districts       | 5
councils                | 5
communities             | 10
zip_codes               | 17
municipalities          | 1

So now we have the data set we can use for the partitioning examples in the next posts. To summarize what we did:

  • https://www.data.gov/ is a great source for free data sets and we used one of them to load data into PostgreSQL
  • copy is great for loading data into PostgreSQL
  • The “case” expression is great for testing various conditions within a statement
  • Materialized views are great when you want to refresh from an underlying data set that changes, but you will need a unique index for refreshing concurrently
  • You might need to cast one data type into another

In the next post we will look at the first partitioning strategy: Range partitioning.

The article PostgreSQL partitioning (1): Preparing the data set appeared first on Blog dbi services.

PostgreSQL 12: New partition reporting functions

Yann Neuhaus - Sun, 2019-06-02 01:00

PostgreSQL 10 introduced declarative partitioning (with some limitations), PostgreSQL 11 improved that a lot (Updating the partition key now works in PostgreSQL 11, Insert…on conflict with partitions finally works in PostgreSQL 11, Local partitioned indexes in PostgreSQL 11, Hash Partitioning in PostgreSQL 11) and PostgreSQL 12 goes even further. Besides the fact that foreign keys can now reference partitioned tables, three new functions made it into PostgreSQL 12 that will give you information about your partitioned tables.

Our little demo setup consists of a list partitioned table with three partitions:

postgres=# create table animals ( id int generated always as identity ( cache 10 ),
postgres(#                        name text unique,
postgres(#                        primary key(id,name)
postgres(#                      ) 
postgres-#                      partition by list (name);
CREATE TABLE
postgres=# create table animals_elephants
postgres-#   partition of animals
postgres-#   for values in ('elephant');
CREATE TABLE
postgres=# create table animals_cats
postgres-#   partition of animals
postgres-#   for values in ('cats');
CREATE TABLE
postgres=# create table animals_dogs
postgres-#   partition of animals
postgres-#   for values in ('dogs');
CREATE TABLE

What already worked in PostgreSQL 11 is that psql will give you information about your partitions:

postgres=# \d animals
                   Partitioned table "public.animals"
 Column |  Type   | Collation | Nullable |           Default            
--------+---------+-----------+----------+------------------------------
 id     | integer |           | not null | generated always as identity
 name   | text    |           | not null | 
Partition key: LIST (name)
Indexes:
    "animals_pkey" PRIMARY KEY, btree (id, name)
    "animals_name_key" UNIQUE CONSTRAINT, btree (name)
Number of partitions: 3 (Use \d+ to list them.)

Using “\d+” will even show you more information:

postgres=# \d+ animals
                                       Partitioned table "public.animals"
 Column |  Type   | Collation | Nullable |           Default            | Storage  | Stats target | Description 
--------+---------+-----------+----------+------------------------------+----------+--------------+-------------
 id     | integer |           | not null | generated always as identity | plain    |              | 
 name   | text    |           | not null |                              | extended |              | 
Partition key: LIST (name)
Indexes:
    "animals_pkey" PRIMARY KEY, btree (id, name)
    "animals_name_key" UNIQUE CONSTRAINT, btree (name)
    "animals_i1" btree (name)
Partitions: animals_cats FOR VALUES IN ('cats'),
            animals_dogs FOR VALUES IN ('dogs'),
            animals_elephants FOR VALUES IN ('elephant')

Now with PostgreSQL 12 there are three new functions which you can use to get information about your partitioned tables and the partitions themselves. The first one will give you the partition tree:

postgres=# select pg_partition_tree('animals');
        pg_partition_tree        
---------------------------------
 (animals,,f,0)
 (animals_elephants,animals,t,1)
 (animals_dogs,animals,t,1)
 (animals_cats,animals,t,1)
(4 rows)
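
As pg_partition_tree returns a set of rows you can also select from it directly, which gives you the individual columns (relid, parentrelid, isleaf and level):

postgres=# select * from pg_partition_tree('animals');
       relid       | parentrelid | isleaf | level 
-------------------+-------------+--------+-------
 animals           |             | f      |     0
 animals_elephants | animals     | t      |     1
 animals_dogs      | animals     | t      |     1
 animals_cats      | animals     | t      |     1
(4 rows)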

The second one will give you the ancestor relations of the given partition:

postgres=# select pg_partition_ancestors('animals_dogs');
 pg_partition_ancestors 
------------------------
 animals_dogs
 animals
(2 rows)

The third one will give you the root for a given partition:

postgres=# select pg_partition_root('animals_cats');
 pg_partition_root 
-------------------
 animals
(1 row)

All of them also work for partitioned indexes:

postgres=# create index animals_i1 on animals (name);
CREATE INDEX
postgres=# select pg_partition_tree('animals_i1');
              pg_partition_tree              
---------------------------------------------
 (animals_i1,,f,0)
 (animals_cats_name_idx,animals_i1,t,1)
 (animals_dogs_name_idx,animals_i1,t,1)
 (animals_elephants_name_idx,animals_i1,t,1)
(4 rows)

Nice.

The article PostgreSQL 12: New partition reporting functions appeared first on Blog dbi services.

Ubuntu Server: eBook

Dietrich Schroff - Sat, 2019-06-01 09:31
For testing waagent (Microsoft Azure Linux Guest Agent) I downloaded Ubuntu Server. Ubuntu advertised an eBook for administrators:

Here is a short summary from this webpage:

Server provisioning: what Network Admins and IT pros need to know
This document is designed to help system administrators and DevOps focused organisations to understand bare metal server provisioning, understand its value proposition, and learn about how leading companies are using server provisioning solutions within their hyperscale environments.
Canonical’s MAAS helps organisations to take full advantage of existing hardware investments by maximising hardware efficiency, and a pathway to leverage the performance and security of hardware based solutions with the economics and efficiencies of the cloud.
(MAAS = Metal As A Service)

The eBook contains 6 chapters on 20 pages:
  1. Executive summary
  2. Cloud speed with bare metal reliability and efficiency 
  3. Get the most out of your hardware investment
  4. How the smartest IT Pros let software do the work
  5. Make hardware investments more strategic 
  6. Conclusion
Ubuntu’s MAAS is the idea to create your own cloud with your own metal. It’s about the automation of installing and provisioning hardware. There are region controllers (regiond) and rack controllers (rackd) which will manage your systems by using DNS, DHCP, PXE, TFTP and some others...
They are introducing availability zones like the ones that exist in Amazon Web Services or Microsoft Azure.
If you are interested, take a look here:


Get started with MAAS
To download and install MAAS for free please visit
ubuntu.com/download/server/provisioning

WebLogic – Upgrade of Java

Yann Neuhaus - Sat, 2019-06-01 03:05

If you are used to Java, you probably know that there are really important security fixes published every quarter in the Java Critical Patch Update (CPU) and it’s really important to keep it up to date. It’s always easy to upgrade your Java installation: you just deploy the new version entirely. In most cases, it’s also pretty easy to update the components that are using Java (e.g. Application Servers like Apache Tomcat), since they are – most of the time – relying on an environment variable (JAVA_HOME) to know which Java should be used. On the other hand, there is the WebLogic case and that’s where all the “fun” is…

In a previous blog, I talked about the JAVA_HOME management inside the WebLogic Server files. If you want to upgrade the Java version that is used by your WebLogic Server, then you basically have three choices, so let’s look at each of them in detail:

 

I. Use a symlink for the JAVA_HOME

You might be tempted to do something like that:

[weblogic@weblogic_server_01 ~]$ cd /app/Java/
[weblogic@weblogic_server_01 Java]$
[weblogic@weblogic_server_01 Java]$ ls -l
total 0
drwxr-xr-x. 1 weblogic weblogic 17 May 18 13:55 jdk1.8.0_151
drwxr-xr-x. 1 weblogic weblogic 17 May 18 14:02 jdk1.8.0_192
[weblogic@weblogic_server_01 Java]$
[weblogic@weblogic_server_01 Java]$ ln -s jdk1.8.0_151/ jdk
[weblogic@weblogic_server_01 Java]$ ls -l
total 0
lrwxrwxrwx. 1 weblogic weblogic 13 May 18 14:05 jdk -> jdk1.8.0_151/
drwxr-xr-x. 1 weblogic weblogic 17 May 18 13:55 jdk1.8.0_151
drwxr-xr-x. 1 weblogic weblogic 17 May 18 14:02 jdk1.8.0_192
[weblogic@weblogic_server_01 Java]$
[weblogic@weblogic_server_01 Java]$ JAVA_HOME=/app/Java/jdk
[weblogic@weblogic_server_01 Java]$
[weblogic@weblogic_server_01 Java]$ $JAVA_HOME/bin/java -version
java version "1.8.0_151"
Java(TM) SE Runtime Environment (build 1.8.0_151-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.151-b12, mixed mode)
[weblogic@weblogic_server_01 Java]$
[weblogic@weblogic_server_01 Java]$ rm jdk
[weblogic@weblogic_server_01 Java]$ ln -s jdk1.8.0_192/ jdk
[weblogic@weblogic_server_01 Java]$ ls -l
total 0
lrwxrwxrwx. 1 weblogic weblogic 13 May 18 14:06 jdk -> jdk1.8.0_192/
drwxr-xr-x. 1 weblogic weblogic 17 May 18 13:55 jdk1.8.0_151
drwxr-xr-x. 1 weblogic weblogic 17 May 18 14:02 jdk1.8.0_192
[weblogic@weblogic_server_01 Java]$
[weblogic@weblogic_server_01 Java]$ $JAVA_HOME/bin/java -version
java version "1.8.0_192"
Java(TM) SE Runtime Environment (build 1.8.0_192-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode)
[weblogic@weblogic_server_01 Java]$

 

And then you would use “/app/Java/jdk” (a symlink) for WebLogic. Please don’t, it’s a terrible idea!

You might think that using a symlink is a good idea and I thought that too at the beginning… But Oracle officially doesn’t support it and, worse than that, the setting will not stay. You can technically use a symlink for your WebLogic Server and it will run (even if it’s not officially supported), but as soon as you run an installer/configurator, it will overwrite and replace the symlink path (“/app/Java/jdk“) in the WebLogic files with the path of the target folder (“/app/Java/jdk1.8.0_192“).

It’s unfortunate, really, but WebLogic will run as planned and it will use “/app/Java/jdk/bin/java” at the beginning; as soon as you upgrade WebLogic or run an installer, you will find out that WebLogic switched to “/app/Java/jdk1.8.0_192/bin/java” on its own and overwrote the configuration that you put in place.

If you are using WLS 12.2.1.3 and therefore have access to the setProperty.sh script (see section II.a below), it will also not accept a symlink; it will just set the JAVA_HOME value to the target folder behind the symlink:

[weblogic@weblogic_server_01 bin]$ ./setProperty.sh -name JAVA_HOME -value /app/Java/jdk
Property JAVA_HOME successfully set to "/app/Java/jdk1.8.0_192"
[weblogic@weblogic_server_01 bin]$

 

So, in summary, don’t use a symlink: it will be pretty for a few minutes/hours/days but in the end it won’t work anymore and you will therefore lose the purpose of using it.

 

II. Manually replace file content with the new JAVA_HOME

Here you have two choices again:

a. Minimal update

As mentioned in my previous blog, there are some specific configuration files that are used when starting components. If you want to change the path of your JAVA_HOME without too much work, a quick (& a little bit dirty) way is to update only the necessary files:

  • ${DOMAIN_HOME}/bin/setDomainEnv.sh
  • ${DOMAIN_HOME}/nodemanager/nodemanager.properties
  • (WLS 12.1.x) ${MW_HOME}/oracle_common/common/bin/commEnv.sh
  • (WLS 12.2.x) ${DOMAIN_HOME}/bin/setNMJavaHome.sh
  • (WLS 12.2.1.2 and below) ${MW_HOME}/oracle_common/common/bin/commBaseEnv.sh
  • (WLS 12.2.1.3 and above) ${MW_HOME}/oui/.globalEnv.properties

So, as you can see, if you want to update the JAVA_HOME in a very simple WebLogic Server, you need to change three or four files, and which ones depends on the version of WebLogic that you are using…
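
As a sketch (reusing the Java paths from this post and the WLS 12.2.x file list above; adapt the list to your version), the minimal update boils down to a single sed over those files:

[weblogic@weblogic_server_01 ~]$ # replace the old Java path in the minimal set of files (12.2.1.2 example)
[weblogic@weblogic_server_01 ~]$ sed -i 's,/app/Java/jdk1.8.0_151,/app/Java/jdk1.8.0_192,g' \
      ${DOMAIN_HOME}/bin/setDomainEnv.sh \
      ${DOMAIN_HOME}/bin/setNMJavaHome.sh \
      ${DOMAIN_HOME}/nodemanager/nodemanager.properties \
      ${MW_HOME}/oracle_common/common/bin/commBaseEnv.sh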

If you are using WLS 12.2.1.3, you don’t have to update the file “${MW_HOME}/oui/.globalEnv.properties” directly; there is actually a utility provided since 12.2.1.3 that will do it for you (more information here):

[weblogic@weblogic_server_01 ~]$ cd ${MW_HOME}/oui/bin/
[weblogic@weblogic_server_01 bin]$
[weblogic@weblogic_server_01 bin]$ cat ../.globalEnv.properties
#This file is automatically generated
#Sat May 18 14:34:24 UTC 2019
JAVA_HOME=/app/Java/jdk1.8.0_151
JAVA_HOME_1_8=/app/Java/jdk1.8.0_151
JVM_64=
[weblogic@weblogic_server_01 bin]$
[weblogic@weblogic_server_01 bin]$ ./getProperty.sh JAVA_HOME
/app/Java/jdk1.8.0_151
[weblogic@weblogic_server_01 bin]$
[weblogic@weblogic_server_01 bin]$ ./setProperty.sh -name JAVA_HOME -value /app/Java/jdk1.8.0_192
Property JAVA_HOME successfully set to "/app/Java/jdk1.8.0_192"
[weblogic@weblogic_server_01 bin]$
[weblogic@weblogic_server_01 bin]$ cat ../.globalEnv.properties
#This file is automatically generated
#Sat May 18 15:02:54 UTC 2019
JAVA_HOME=/app/Java/jdk1.8.0_192
JAVA_HOME_1_8=/app/Java/jdk1.8.0_151
JVM_64=
[weblogic@weblogic_server_01 bin]$

 

This solution works but it’s not very clean. You will be left with references to the old path in a lot of files so at some point, you (or someone else) might wonder which Java is actually used. That’s why I personally don’t really like this solution.

b. Cleaner update

Instead of updating only the minimal set of files, what about being more thorough and updating all the files that mention the old Java path? This requires some (not so complicated) commands to find all relevant files, excluding logs, previous patch files, binaries and a few other things, and then you can just replace the old Java path with the new one.

This is a better solution but it has a few drawbacks:

  • You absolutely need to know what you are doing and you need to test it thoroughly. Don’t try sed commands (or whatever you prefer) to update files directly if you aren’t sure about what you are doing in the first place
  • It takes some time to list all the files and update them properly, especially if there are a lot of files under $DOMAIN_HOME or $MW_HOME/$ORACLE_HOME. Even if you automate it, your Operating System still needs to go through all the existing files and since this needs to be done while WebLogic isn’t running, it will increase the downtime needed
  • This would need to be done for each and every Java patch/update

Despite these drawbacks, I still prefer this solution because it’s much cleaner. No end-user would care if there are still references to an old Java in the WebLogic files, as long as the Application is working for them, but I would know that they are there and it would drive me crazy (a little bit :D).

Below are some example commands that can be used to do that. Please test them and update them as per your need! If you are using different components or additional ones, you might want to take that into account.

[weblogic@weblogic_server_01 ~]$ OLD_JAVA="/app/Java/jdk1.8.0_151"
[weblogic@weblogic_server_01 ~]$ NEW_JAVA="/app/Java/jdk1.8.0_192"
[weblogic@weblogic_server_01 ~]$
[weblogic@weblogic_server_01 ~]$ grep -R "${OLD_JAVA}" $DOMAIN_HOME | grep -vE "^[^:]*.log:|^[^:]*/logs/|^[^:]*/nohupLogs/|^[^:]*/.patch_storage/|^Binary file " > ~/listJavaBefore_DOMAIN_HOME
[weblogic@weblogic_server_01 ~]$
[weblogic@weblogic_server_01 ~]$ grep -R "${OLD_JAVA}" $ORACLE_HOME | grep -vE "^[^:]*.log:|^[^:]*/logs/|^[^:]*/nohupLogs/|^[^:]*/.patch_storage/|^Binary file " > ~/listJavaBefore_ORACLE_HOME
[weblogic@weblogic_server_01 ~]$
[weblogic@weblogic_server_01 ~]$ awk -F':' '{print $1}' ~/listJavaBefore_DOMAIN_HOME | sort -u; echo
...

[weblogic@weblogic_server_01 ~]$
[weblogic@weblogic_server_01 ~]$ awk -F':' '{print $1}' ~/listJavaBefore_ORACLE_HOME | sort -u; echo
...

[weblogic@weblogic_server_01 ~]$
[weblogic@weblogic_server_01 ~]$ while read line; do FILE_TO_UPDATE=`echo ${line} | awk -F':' '{print $1}'`; sed -i "s,${OLD_JAVA},${NEW_JAVA},g" ${FILE_TO_UPDATE}; done < ~/listJavaBefore_DOMAIN_HOME
[weblogic@weblogic_server_01 ~]$
[weblogic@weblogic_server_01 ~]$ while read line; do FILE_TO_UPDATE=`echo ${line} | awk -F':' '{print $1}'`; sed -i "s,${OLD_JAVA},${NEW_JAVA},g" ${FILE_TO_UPDATE}; done < ~/listJavaBefore_ORACLE_HOME
[weblogic@weblogic_server_01 ~]$

 

The grep commands above list all occurrences of the $OLD_JAVA path, the awk commands list all the files that contain one or more occurrences and the while loop replaces all occurrences with the value of $NEW_JAVA. Once that’s done, you can check again with the grep command using both OLD_JAVA and NEW_JAVA to ensure that all references to the old JAVA_HOME have been properly replaced with the new JAVA_HOME. Instead of OLD_JAVA, you can also use a regex, if you prefer, in case you had several Java versions in the past so it matches all of them and not just the last one. Something like ‘grep “${JAVA_BASE}/[a-zA-Z0-9._]*” …‘ for example or ‘grep “${JAVA_BASE}/jdk[0-9._]*” …
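
To double-check, here is a minimal verification sketch (reusing the variables and exclusions from above); a count of 0 means that no references to the old path remain outside of the excluded files:

[weblogic@weblogic_server_01 ~]$ grep -R "${OLD_JAVA}" $DOMAIN_HOME $ORACLE_HOME | grep -vE "^[^:]*.log:|^[^:]*/logs/|^[^:]*/nohupLogs/|^[^:]*/.patch_storage/|^Binary file " | wc -l
0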

 

III. Use a generic folder

In my opinion, none of the above solutions matter because this one is the best solution and it should be the one that everybody uses. What I mean by a “generic folder” is simply a folder whose name never changes. For that purpose, I’m usually very happy with symlinks for other applications/application servers but for WebLogic this needs to be a real folder.

In case of an upgrade, you will need to execute both this section III as well as section II.b so that all references point to the generic folder. On the other hand, in case of a new installation, it’s just all about using the generic folder’s path while installing WebLogic and then you will never have anything else to do. You can see this section III as being somehow the opposite of section I.

Below, I’m just taking the status at the end of section I where we have a symlink named “jdk” that points to “jdk1.8.0_192“. I’m changing that: removing the symlink, renaming the folder “jdk1.8.0_151” into “jdk” and then creating a new symlink named “jdk1.8.0_151” that points to “jdk“, so it’s the other way around. With that, we have a generic folder named “jdk” that can hold any Java version and we will never have to change the JAVA_HOME again:

[weblogic@weblogic_server_01 Java]$ ls -l
total 0
lrwxrwxrwx. 1 weblogic weblogic 13 May 18 14:06 jdk -> jdk1.8.0_192/
drwxr-xr-x. 1 weblogic weblogic 17 May 18 13:55 jdk1.8.0_151
drwxr-xr-x. 1 weblogic weblogic 17 May 18 14:02 jdk1.8.0_192
[weblogic@weblogic_server_01 Java]$
[weblogic@weblogic_server_01 Java]$ $JAVA_HOME/bin/java -version
java version "1.8.0_192"
Java(TM) SE Runtime Environment (build 1.8.0_192-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode)
[weblogic@weblogic_server_01 Java]$
[weblogic@weblogic_server_01 Java]$ rm jdk
[weblogic@weblogic_server_01 Java]$ mv jdk1.8.0_151 jdk
[weblogic@weblogic_server_01 Java]$ ln -s jdk/ jdk1.8.0_151
[weblogic@weblogic_server_01 Java]$
[weblogic@weblogic_server_01 Java]$ ls -l
total 0
drwxr-xr-x. 1 weblogic weblogic 17 May 18 13:55 jdk
lrwxrwxrwx. 1 weblogic weblogic 13 May 18 16:22 jdk1.8.0_151 -> jdk/
drwxr-xr-x. 1 weblogic weblogic 17 May 18 14:02 jdk1.8.0_192
[weblogic@weblogic_server_01 Java]$
[weblogic@weblogic_server_01 Java]$ $JAVA_HOME/bin/java -version
java version "1.8.0_151"
Java(TM) SE Runtime Environment (build 1.8.0_151-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.151-b12, mixed mode)
[weblogic@weblogic_server_01 Java]$

 

As mentioned above, at this point, if you are going to install a new WebLogic Server, then just use “/app/Java/jdk” as the JAVA_HOME and you will never have to touch any WebLogic files. If you have already a WebLogic Server installed and using something else than “/app/Java/jdk“, then you will have to do the section II.b. Section II.b is designed to be executed each time you change your Java version but here, since the folder name will always stay the same, you can do it once only.

To upgrade your version of Java once you are using the generic folder, it’s that simple:

[weblogic@weblogic_server_01 Java]$ ls -l
total 0
drwxr-xr-x. 1 weblogic weblogic 17 May 18 13:55 jdk
lrwxrwxrwx. 1 weblogic weblogic 13 May 18 16:22 jdk1.8.0_151 -> jdk/
drwxr-xr-x. 1 weblogic weblogic 17 May 18 14:02 jdk1.8.0_192
[weblogic@weblogic_server_01 Java]$
[weblogic@weblogic_server_01 Java]$ $JAVA_HOME/bin/java -version
java version "1.8.0_151"
Java(TM) SE Runtime Environment (build 1.8.0_151-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.151-b12, mixed mode)
[weblogic@weblogic_server_01 Java]$
[weblogic@weblogic_server_01 Java]$ rm jdk1.8.0_151
[weblogic@weblogic_server_01 Java]$ mv jdk jdk1.8.0_151
[weblogic@weblogic_server_01 Java]$ mv jdk1.8.0_192 jdk
[weblogic@weblogic_server_01 Java]$ ln -s jdk/ jdk1.8.0_192
[weblogic@weblogic_server_01 Java]$
[weblogic@weblogic_server_01 Java]$ ls -l
total 0
drwxr-xr-x. 1 weblogic weblogic 17 May 18 14:02 jdk
drwxr-xr-x. 1 weblogic weblogic 17 May 18 13:55 jdk1.8.0_151
lrwxrwxrwx. 1 weblogic weblogic 13 May 18 16:28 jdk1.8.0_192 -> jdk/
[weblogic@weblogic_server_01 Java]$
[weblogic@weblogic_server_01 Java]$ $JAVA_HOME/bin/java -version
java version "1.8.0_192"
Java(TM) SE Runtime Environment (build 1.8.0_192-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode)
[weblogic@weblogic_server_01 Java]$

 

Once that’s done, simply restart the WebLogic processes and you are done: Java has been upgraded without effort, the WebLogic files still reference the correct JAVA_HOME and it will stay clean, efficient and working, forever.
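
As a sketch of such a restart, assuming a per-domain NodeManager, the standard domain scripts and a hypothetical Managed Server name “msExample” (adapt this to how your domain is actually started and stopped):

[weblogic@weblogic_server_01 ~]$ ${DOMAIN_HOME}/bin/stopManagedWebLogic.sh msExample    # repeat for each Managed Server
[weblogic@weblogic_server_01 ~]$ ${DOMAIN_HOME}/bin/stopWebLogic.sh                     # stops the AdminServer
[weblogic@weblogic_server_01 ~]$ ${DOMAIN_HOME}/bin/stopNodeManager.sh
[weblogic@weblogic_server_01 ~]$ nohup ${DOMAIN_HOME}/bin/startNodeManager.sh > /dev/null 2>&1 &
[weblogic@weblogic_server_01 ~]$ nohup ${DOMAIN_HOME}/bin/startWebLogic.sh > /dev/null 2>&1 &
[weblogic@weblogic_server_01 ~]$ nohup ${DOMAIN_HOME}/bin/startManagedWebLogic.sh msExample > /dev/null 2>&1 &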

 

The article WebLogic – Upgrade of Java appeared first on Blog dbi services.

WebLogic – JAVA_HOME in WebLogic, a nightmare?

Yann Neuhaus - Sat, 2019-06-01 03:00

Everybody knows Java but not everybody loves Java. With everything Oracle is currently doing regarding Java, I can’t say I’m blaming them… But that’s not the topic of this blog! Hurrah. Well, actually, I’m not sure the topic of this blog is much better since I will talk about the management of the JAVA_HOME environment variable with WebLogic and more specifically with a very simple standalone WebLogic Server, nothing more. I always wanted to write a blog about how to properly upgrade Java when using WebLogic so I thought about writing this blog first, as an introduction.

Before going deeply into how WebLogic manages the JAVA_HOME (and therefore how it manages which Java version is used), in case you are wondering below, these are some possible values for JAVA_VENDOR: Oracle, HP, IBM, Sun. I assume most people aren’t using this environment variable; they are just using JAVA_HOME and expect it to work. Let’s see what WebLogic thinks about that…

 

I. JAVA_HOME in WLS 12.1.3

I didn’t check all versions since it’s quite a lengthy and boring process, but I believe this section applies to all 12.1.x versions. In all sections below, ${MW_HOME} is the same as ${ORACLE_HOME} and ${WL_HOME} is ${ORACLE_HOME}/wlserver.

If you want to start a NodeManager, this is what WebLogic 12.1.3 will do in regard to the JAVA_HOME variable:

  • execute ${DOMAIN_HOME}/bin/startNodeManager.sh
    • execute ${WL_HOME}/server/bin/startNodeManager.sh
      • source ${MW_HOME}/oracle_common/common/bin/commEnv.sh ~~> If the environment variable JAVA_HOME is set, that’s good… But if JAVA_VENDOR isn’t set as well, then your JAVA_HOME isn’t used and a hardcoded value set directly in this file is taken instead. If the folder behind that hardcoded JAVA_HOME (it specifically checks whether it’s a FOLDER!!!) doesn’t exist, then it falls back to the environment variable JAVA_HOME that you defined, but if the folder exists, it keeps the hardcoded JAVA_HOME.

I believe this was done to hide the misery that handling JAVA_HOME in WebLogic is… Basically, if you upgraded your Java and replaced the JAVA_HOME environment variable without touching the WebLogic files, WebLogic would try to use the old Java, and only if that folder isn’t there anymore because you removed it would it use the value coming from your JAVA_HOME… Why would they do that? Don’t ask me.

 

If you want to start a Managed Server (it’s the same thing for the AdminServer, except the first line), this is what WebLogic 12.1.3 will do in regard to the JAVA_HOME variable:

  • execute ${DOMAIN_HOME}/bin/startManagedWebLogic.sh
    • execute ${DOMAIN_HOME}/bin/startWebLogic.sh
      • source ${DOMAIN_HOME}/bin/setDomainEnv.sh ~~> It doesn’t care if JAVA_HOME is set, it overwrites it. The value will depend on VM_TYPE and JAVA_VENDOR but it overwrites it using the hardcoded value from this file directly. At this point, JAVA_HOME is set and JAVA_VENDOR is overwritten, except if VM_TYPE=JRockit in which case JAVA_HOME is “@DEFAULT_BEA_JAVA_HOME” and JAVA_VENDOR isn’t set…
        • source ${WL_HOME}/common/bin/commEnv.sh
          • source ${MW_HOME}/oracle_common/common/bin/commEnv.sh ~~> Same script as for the NodeManager, except that this time, JAVA_HOME was overwritten by the setDomainEnv.sh script already… So, if JAVA_HOME OR JAVA_VENDOR isn’t set, then JAVA_HOME is again overwritten by yet another hardcoded value…
        • source ${DOMAIN_HOME}/bin/setStartupEnv.sh
        • source ${DOMAIN_HOME}/bin/setUserOverrides.sh

Summary: It’s a nightmare, isn’t it? If you want to be sure that your environment variable JAVA_HOME is used in the end without being overwritten, then… Good luck, because that’s not possible. For the NodeManager, it would be possible since you just need to define both JAVA_HOME and JAVA_VENDOR (or make sure you remove any old JDK from the FileSystem so it falls back to your JAVA_HOME) but for the Managed Servers, it’s not possible. Well, actually it would be possible by setting your JAVA_HOME variable in the “${DOMAIN_HOME}/bin/setUserOverrides.sh” file… Since this file is loaded at the very end, it would use your values, but please don’t do that, it’s so ugly!

You think that’s complicated? Please read below, there is more.

 

II. JAVA_HOME in WLS 12.2.1.2

In WLS 12.2, Oracle changed the way the JAVA_HOME is handled for the NodeManager and for the Managed Servers by introducing several things:

  • A new script is used to set the Java Home for the NodeManager: {DOMAIN_HOME}/bin/setNMJavaHome.sh
  • Two new scripts are used to set the environment: the script ${MW_HOME}/oracle_common/common/bin/commEnv.sh now doesn’t contain anything anymore but instead, it loads ${MW_HOME}/oracle_common/common/bin/commBaseEnv.sh as well as ${MW_HOME}/oracle_common/common/bin/commExtEnv.sh for both the NodeManager and the Managed Servers

 

If you want to start a NodeManager, this is what WebLogic 12.2.1.2 will do in regard to the JAVA_HOME variable:

  • execute ${DOMAIN_HOME}/bin/startNodeManager.sh
    • source ${DOMAIN_HOME}/bin/setNMJavaHome.sh ~~> It doesn’t care if JAVA_HOME is set, it overwrites it. The value will depend on VM_TYPE and JAVA_VENDOR but it overwrites it using the hardcoded value from this file directly. At this point, JAVA_HOME is set and JAVA_VENDOR is overwritten, except if VM_TYPE=JRockit in which case JAVA_HOME is empty and JAVA_VENDOR isn’t set…
    • execute ${WL_HOME}/server/bin/startNodeManager.sh
      • source ${MW_HOME}/oracle_common/common/bin/commEnv.sh
        • source ${MW_HOME}/oracle_common/common/bin/commBaseEnv.sh ~~> Different script but same piece of code as for the NodeManager in WLS 12.1.3, except that this time, JAVA_HOME was overwritten by the setNMJavaHome.sh script already… So, if JAVA_HOME OR JAVA_VENDOR isn’t set, then JAVA_HOME is again overwritten by yet another hardcoded value…
        • source ${MW_HOME}/oracle_common/common/bin/commExtEnv.sh

 

If you want to start a Managed Server, this is what WebLogic 12.2.1.2 will do in regard to the JAVA_HOME variable:

  • execute ${DOMAIN_HOME}/bin/startManagedWebLogic.sh
    • execute ${DOMAIN_HOME}/bin/startWebLogic.sh
      • source ${DOMAIN_HOME}/bin/setDomainEnv.sh ~~> It doesn’t care if JAVA_HOME is set, it overwrites it. The value will depend on VM_TYPE and JAVA_VENDOR but it overwrites it using the hardcoded value from this file directly. At this point, JAVA_HOME is set and JAVA_VENDOR is overwritten, except if VM_TYPE=JRockit in which case JAVA_HOME is “@DEFAULT_BEA_JAVA_HOME” and JAVA_VENDOR isn’t set…
        • source ${MW_HOME}/oracle_common/common/bin/commEnv.sh
          • source ${MW_HOME}/oracle_common/common/bin/commBaseEnv.sh ~~> Same script as for the NodeManager, except that this time, JAVA_HOME was overwritten by the setDomainEnv.sh script already… So, if JAVA_HOME OR JAVA_VENDOR isn’t set, then JAVA_HOME is again overwritten by yet another hardcoded value…
          • source ${MW_HOME}/oracle_common/common/bin/commExtEnv.sh
        • source ${DOMAIN_HOME}/bin/setStartupEnv.sh
        • source ${DOMAIN_HOME}/bin/setUserOverrides.sh

 

III. JAVA_HOME in WLS 12.2.1.3

In WLS 12.2.1.3, Oracle changed again the way the JAVA_HOME is handled for the NodeManager and for the Managed Servers by introducing one more thing: instead of hardcoding the JAVA_HOME in ${MW_HOME}/oracle_common/common/bin/commBaseEnv.sh, it now retrieves the value using a script (${MW_HOME}/oui/bin/getProperty.sh) that reads a specific file (${MW_HOME}/oui/.globalEnv.properties).

If you want to start a NodeManager, this is what WebLogic 12.2.1.3 will do in regard to the JAVA_HOME variable:

  • execute ${DOMAIN_HOME}/bin/startNodeManager.sh
    • source ${DOMAIN_HOME}/bin/setNMJavaHome.sh ~~> Same script as for the NodeManager in WLS 12.2.1.2
    • execute ${WL_HOME}/server/bin/startNodeManager.sh
      • source ${MW_HOME}/oracle_common/common/bin/commEnv.sh
        • source ${MW_HOME}/oracle_common/common/bin/commBaseEnv.sh ~~> (Almost) Same script as for the NodeManager in WLS 12.2.1.2, except that this time, JAVA_HOME is again overwritten by retrieving properties value using the new script
          • execute ${MW_HOME}/oui/bin/getProperty.sh JAVA_HOME
            • read ${MW_HOME}/oui/.globalEnv.properties ~~> Always overwrite JAVA_HOME with the hardcoded value
        • source ${MW_HOME}/oracle_common/common/bin/commExtEnv.sh

 

If you want to start a Managed Server, this is what WebLogic 12.2.1.3 will do in regard to the JAVA_HOME variable:

  • execute ${DOMAIN_HOME}/bin/startManagedWebLogic.sh
    • execute ${DOMAIN_HOME}/bin/startWebLogic.sh
      • source ${DOMAIN_HOME}/bin/setDomainEnv.sh ~~> Same script as for the Managed Servers in WLS 12.2.1.2
        • source ${MW_HOME}/oracle_common/common/bin/commEnv.sh
          • source ${MW_HOME}/oracle_common/common/bin/commBaseEnv.sh ~~> (Almost) Same script as for the Managed Servers in WLS 12.2.1.2, except that this time, JAVA_HOME is again overwritten by retrieving properties value using the new script
            • execute ${MW_HOME}/oui/bin/getProperty.sh JAVA_HOME
              • read ${MW_HOME}/oui/.globalEnv.properties ~~> Always overwrite JAVA_HOME with the hardcoded value
          • source ${MW_HOME}/oracle_common/common/bin/commExtEnv.sh
        • source ${DOMAIN_HOME}/bin/setStartupEnv.sh
        • source ${DOMAIN_HOME}/bin/setUserOverrides.sh
        • source ${DOMAIN_HOME}/bin/setUserOverridesLate.sh

 

Easy right? Joke aside, even if it’s a huge mess, with WLS 12.2.1.3, you now have the file “${MW_HOME}/oui/.globalEnv.properties” which is used by both the NodeManager as well as the Admin/Managed Servers and therefore if you overwrite the value in this file, you would think that you are good to go right? Well… “Why would it be so easy?”, that’s probably what Oracle thought when they were thinking about how they should handle the JAVA_HOME internally ;).

Instead of going top to bottom, let’s go the other way around. Let’s say that you updated the value of JAVA_HOME inside the file “${MW_HOME}/oui/.globalEnv.properties”. This file is read using the “${MW_HOME}/oui/bin/getProperty.sh” script which is executed by “${MW_HOME}/oracle_common/common/bin/commBaseEnv.sh”. To execute the getProperty.sh, you need either the JAVA_HOME or the JAVA_VENDOR to be unset or empty.

Before the commBaseEnv.sh script, the only one that contains references to JAVA_HOME or JAVA_VENDOR is:

  • For the NodeManager: ${DOMAIN_HOME}/bin/setNMJavaHome.sh
    • JAVA_HOME can be empty only if VM_TYPE=JRockit, otherwise it’s always overwritten with the hardcoded values
    • JAVA_VENDOR can be empty only if VM_TYPE=JRockit, otherwise it’s always overwritten with “Oracle”
    • Conclusion: To be able to load the value from “${MW_HOME}/oui/.globalEnv.properties”, JAVA_HOME or JAVA_VENDOR needs to be empty so you need to use VM_TYPE=JRockit if you want to use this new “feature” added in WLS 12.2.1.3… And obviously, if you aren’t using a JRockit VM (which, by the way, doesn’t exist anymore: the latest stable release was 7 years ago…), then the NodeManager won’t start, so good luck trying to use this “${MW_HOME}/oui/.globalEnv.properties” file because it’s impossible ;).
  • For the Managed Servers: ${DOMAIN_HOME}/bin/setDomainEnv.sh
    • JAVA_HOME cannot be empty or not set because of the hardcoded values in this file and even if VM_TYPE=JRockit, then JAVA_HOME is set to “@DEFAULT_BEA_JAVA_HOME”
    • JAVA_VENDOR can be empty only if VM_TYPE=JRockit, otherwise it’s always overwritten with “Oracle”
    • Conclusion: To be able to load the value from “${MW_HOME}/oui/.globalEnv.properties”, JAVA_VENDOR needs to be empty so you need to use VM_TYPE=JRockit… Same conclusion as for the NodeManager, it’s not possible.

So, why did they introduce this new file in WLS 12.2.1.3? Well I can only hope that it is used somewhere! One thing is certain, it’s not for the NodeManager, AdminServer or Managed Servers (there is not much left… :D). If you are aware of anything that would make some use of that, don’t hesitate to share!

So, in summary, WebLogic basically doesn’t care if you set a JAVA_HOME environment variable; it will just use its hardcoded values anyway, contrary to most Application Servers which trust the people installing/configuring them to set the JAVA_HOME to the expected value. In a future blog, I will talk about the upgrade process of Java in the scope of a WebLogic Server, to make it smoother (well, as much as technically possible… ;)).
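
If you want to verify which Java a running WebLogic process actually picked up (instead of trusting your environment variables), a quick sketch is to look at the process list; the first element of the command line is the java binary that is really in use:

[weblogic@weblogic_server_01 ~]$ ps -ef | grep "[w]eblogic.Server" | awk '{print $8}'
/app/Java/jdk1.8.0_192/bin/java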

 

The article WebLogic – JAVA_HOME in WebLogic, a nightmare? appeared first on Blog dbi services.

Documentum – DOCUMENTUM_SHARED is dead?

Yann Neuhaus - Sat, 2019-06-01 02:30

In June last year, I did my first manual installation (so without docker) of Documentum 16.4 and I was testing it with PostgreSQL. I quickly realized that there were some changes in Documentum and, unfortunately, I don’t believe they are for the best! In this blog, I will talk about the DOCUMENTUM_SHARED environment variable. I tested this almost a year ago with a PostgreSQL binary but it’s the same for all Documentum 16.4 binaries. This isn’t a very technical blog; it’s more of a small reflection about what OpenText is currently doing.

 

I. DOCUMENTUM_SHARED is dead

 

In Documentum 7.3 or below, you could define an environment variable named DOCUMENTUM_SHARED before installing Documentum (see this blog for example) which would then be used to define where the Shared components of a Content Server should be installed. This mainly includes the following:

  • The DFC properties and libraries
  • The Java Method Server (JBoss/WildFly + all Applications)
  • The Java binaries

Starting with Documentum 16.4, this environment variable has been deprecated (see KB11002330) and Documentum will simply ignore it. So, you will end up with all the above components being installed right under $DOCUMENTUM, with everything else. I don’t like that because on Linux, we are used to splitting things up and therefore we are used to having only a few folders under $DOCUMENTUM and a few others under $DOCUMENTUM_SHARED. Now everything is put under $DOCUMENTUM, even the DFC files/folders. By default in your 16.4 dfc.properties, you will have a definition of “dfc.data.dir=…” which points to $DOCUMENTUM as well ($DOCUMENTUM_SHARED before), so you will end up with a lot of ugly stuff right under $DOCUMENTUM and it becomes messy! These are the DFC files/folders I’m talking about:

  • $DOCUMENTUM/apptoken/
  • $DOCUMENTUM/cache/
  • $DOCUMENTUM/checkout/
  • $DOCUMENTUM/export/
  • $DOCUMENTUM/identityInterprocessMutex.lock
  • $DOCUMENTUM/local/
  • $DOCUMENTUM/logs/

Obviously you can change the definition of the “dfc.data.dir” so this data is put elsewhere, and you should really do that in all dfc.properties files, but the default is kind of surprising. When I’m doing a review of an environment or an assessment of some sort, the first thing I always do is go to the $DOCUMENTUM folder and list its content. If this folder is clean (no log files, no backups, no temp files, no cache files, aso…), then there is a good chance that the complete installation is more or less clean as well. If there is a lot of mess even in the $DOCUMENTUM folder, then I know that it’ll be a long day.
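For example, a minimal sketch of what that could look like (the target path is an assumption, adapt it to your own standards):

# in the dfc.properties used by the Content Server (e.g. $DOCUMENTUM/config/dfc.properties)
dfc.data.dir=/app/dctm/shared/data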

 

II. long live DOCUMENTUM_SHARED! (for now)

 

So why am I saying that? Well, as always when you try to deprecate something, there are leftovers here and there and it’s pretty hard to change people’s minds… Take for example “docbase” VS “repository”… Since Documentum 7.0, a “docbase” is officially called a “repository”, yet a lot of people still use “docbase” and even Documentum does (there are a lot of remaining references everywhere). I believe it will be the same for DOCUMENTUM_SHARED.

At the moment in Documentum 16.4, there are the following references to DOCUMENTUM_SHARED:

  • D2 16.4 still uses DOCUMENTUM_SHARED to know where the components are installed. This is used to deploy D2 libraries into the JMS mainly. I didn’t check but I guess it will be the same for the BPM/xCP
  • MigrationUtil (change docbase ID, docbase name, server config name, aso…) still uses DOCUMENTUM_SHARED to know where the dfc.properties is, where the JMS is, aso…
  • dm_set_server_env scripts still use DOCUMENTUM_SHARED for defining other variables like LD_LIBRARY_PATH or CLASSPATH

Because of these remaining references (and probably many more), OpenText didn’t just completely remove the DOCUMENTUM_SHARED variable… No, it’s still there, but they put it, with a hardcoded value (same as $DOCUMENTUM), directly into the dm_set_server_env scripts so that the remaining references still work properly.

OpenText probably didn’t want to remove the environment variable in one go, so they are proceeding step by step: first ignoring it, then probably removing it completely in a future major version. Until then, I will continue to define my DOCUMENTUM_SHARED environment variable, but for Documentum 16.4, I will set it to the same value as DOCUMENTUM because you never know, maybe in the next version the variable will come back… ;).
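If you want to do the same, a minimal sketch for your profile (the installation path is just an example):

export DOCUMENTUM=/app/dctm/server
export DOCUMENTUM_SHARED=$DOCUMENTUM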

 

Cet article Documentum – DOCUMENTUM_SHARED is dead? est apparu en premier sur Blog dbi services.

[New feature] Customize Backup Retention Period of DB System In Oracle Cloud Infrastructure (OCI)

Online Apps DBA - Fri, 2019-05-31 23:46

[New Feature] Customize Backup Retention on OCI DB System. Until recently, backups in OCI were retained for 30 days by default and there was no way to change this for automated backups. Backup retention is now enhanced for DB Systems on OCI. Check this enhancement at https://k21academy.com/clouddba46 […]

The post [New feature] Customize Backup Retention Period of DB System In Oracle Cloud Infrastructure (OCI) appeared first on Oracle Trainings for Apps & Fusion DBA.

Categories: APPS Blogs

Oracle Platinum Services for Engineered Systems—No Additional Charge

Chris Warticki - Fri, 2019-05-31 16:19
Fully Realize the Benefits of Your Oracle Engineered Systems with Oracle Platinum Services

We launched Oracle Platinum Services in 2012, and what began as a breakthrough support offering for Exadata customers has evolved into an integral aspect of Oracle Support, designed to help customers fully realize the benefits of their Oracle Engineered Systems.

Today there are over 5,700 Oracle Engineered Systems implemented on Oracle Platinum Services—all entitled to 24x7 remote fault monitoring, quarterly patch deployment services, and unprecedented response times for fault resolution and restoration—at no additional charge. The Oracle Platinum Services page has more information on what Oracle Platinum Services offers and how other customers are leveraging the features of the service.

Considering Oracle Platinum Services? Have Questions?

There are a few topics to consider before boarding your systems to Oracle Platinum Services. You must ensure that the service addresses the needs of your organization while meeting any security or other guidelines your organization might have in place. Your next consideration should be how you can meet the requirements to qualify for Oracle Platinum Services.

The Oracle Platinum Services FAQ is an excellent resource to start with, to get answers to general questions. Further inquiries can be made of your Technical Account Manager, if you wish to further investigate your options. They can help direct you to an Oracle Platinum Driver, a resource skilled in helping customers navigate questions related to implementing Oracle Platinum Services on their Oracle Engineered Systems.  

Learn More

You can find more information in the Oracle Platinum Services Onboarding document, which lays out what customers can expect when boarding their systems on to Oracle Platinum Services.


PostgreSQL 12: Control when generic plans are used

Yann Neuhaus - Fri, 2019-05-31 13:02

When you are using prepared statements in PostgreSQL you might get a custom or a generic plan. Custom plans come with the overhead of re-planning, while generic plans avoid re-planning the statement. As soon as you get a generic plan, that plan will be used for the lifetime of your session, and there are situations where you do not want that, as a generic plan might be more expensive than a custom plan. Starting with PostgreSQL 12 (which currently is in beta) you have more control over this.

Let’s use the same simple test case as in the previous post about custom and generic plans:

postgres=# select version();
                                                  version                                                  
-----------------------------------------------------------------------------------------------------------
 PostgreSQL 12devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.2.1 20180905 (Red Hat 8.2.1-3), 64-bit
(1 row)

postgres=# create table demo ( a int, b text );
CREATE TABLE
postgres=# insert into demo select i, 'aaa' from generate_series (1,100) i;
INSERT 0 100
postgres=# insert into demo select i, 'bbb' from generate_series (101,200) i;
INSERT 0 100
postgres=# insert into demo select i, 'ccc' from generate_series (201,300) i;
INSERT 0 100
postgres=# analyze demo;
ANALYZE
postgres=# 

When we prepare the statement below and then execute it six times, we will see a generic plan on the sixth execution (note the “Filter: (b = $1)” in the last plan):

postgres=# prepare my_stmt as select * from demo where b = $1;
PREPARE
postgres=# explain (analyze) execute my_stmt ( 'aaa' );
                                            QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
 Seq Scan on demo  (cost=0.00..5.75 rows=100 width=8) (actual time=0.027..0.107 rows=100 loops=1)
   Filter: (b = 'aaa'::text)
   Rows Removed by Filter: 200
 Planning Time: 0.431 ms
 Execution Time: 0.198 ms
(5 rows)

postgres=# explain (analyze) execute my_stmt ( 'aaa' );
                                            QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
 Seq Scan on demo  (cost=0.00..5.75 rows=100 width=8) (actual time=0.032..0.113 rows=100 loops=1)
   Filter: (b = 'aaa'::text)
   Rows Removed by Filter: 200
 Planning Time: 0.196 ms
 Execution Time: 0.155 ms
(5 rows)

postgres=# explain (analyze) execute my_stmt ( 'aaa' );
                                            QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
 Seq Scan on demo  (cost=0.00..5.75 rows=100 width=8) (actual time=0.032..0.113 rows=100 loops=1)
   Filter: (b = 'aaa'::text)
   Rows Removed by Filter: 200
 Planning Time: 0.168 ms
 Execution Time: 0.154 ms
(5 rows)

postgres=# explain (analyze) execute my_stmt ( 'aaa' );
                                            QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
 Seq Scan on demo  (cost=0.00..5.75 rows=100 width=8) (actual time=0.041..0.135 rows=100 loops=1)
   Filter: (b = 'aaa'::text)
   Rows Removed by Filter: 200
 Planning Time: 0.238 ms
 Execution Time: 0.193 ms
(5 rows)

postgres=# explain (analyze) execute my_stmt ( 'aaa' );
                                            QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
 Seq Scan on demo  (cost=0.00..5.75 rows=100 width=8) (actual time=0.031..0.112 rows=100 loops=1)
   Filter: (b = 'aaa'::text)
   Rows Removed by Filter: 200
 Planning Time: 0.169 ms
 Execution Time: 0.154 ms
(5 rows)

postgres=# explain (analyze) execute my_stmt ( 'aaa' );
                                            QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
 Seq Scan on demo  (cost=0.00..5.75 rows=100 width=8) (actual time=0.033..0.120 rows=100 loops=1)
   Filter: (b = $1)
   Rows Removed by Filter: 200
 Planning Time: 0.163 ms
 Execution Time: 0.163 ms
(5 rows)

PostgreSQL 12 will come with a new parameter to control that:

postgres=# show plan_cache_mode;
 plan_cache_mode 
-----------------
 auto
(1 row)
postgres=# select extra_desc from pg_settings where name = 'plan_cache_mode';
                                                                            extra_desc                                                                             
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Prepared statements can have custom and generic plans, and the planner will attempt to choose which is better.  This can be set to override the default behavior.
(1 row)

The default value is “auto”, which means the same behavior as in previous versions of PostgreSQL. If you want to force custom plans you can set it to “force_custom_plan”, or you can set it to “force_generic_plan” if you only want to see generic plans. Using the same example as above, but with the parameter set to “force_custom_plan”, we will not see a generic plan anymore:

postgres=# set plan_cache_mode = force_custom_plan;
SET
postgres=# explain (analyze) execute my_stmt ( 'aaa' );
                                            QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
 Seq Scan on demo  (cost=0.00..5.75 rows=100 width=8) (actual time=0.034..0.127 rows=100 loops=1)
   Filter: (b = 'aaa'::text)
   Rows Removed by Filter: 200
 Planning Time: 0.209 ms
 Execution Time: 0.172 ms
(5 rows)

postgres=# explain (analyze) execute my_stmt ( 'aaa' );
                                            QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
 Seq Scan on demo  (cost=0.00..5.75 rows=100 width=8) (actual time=0.152..0.236 rows=100 loops=1)
   Filter: (b = 'aaa'::text)
   Rows Removed by Filter: 200
 Planning Time: 0.170 ms
 Execution Time: 0.279 ms
(5 rows)

postgres=# explain (analyze) execute my_stmt ( 'aaa' );
                                            QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
 Seq Scan on demo  (cost=0.00..5.75 rows=100 width=8) (actual time=0.031..0.112 rows=100 loops=1)
   Filter: (b = 'aaa'::text)
   Rows Removed by Filter: 200
 Planning Time: 0.170 ms
 Execution Time: 0.152 ms
(5 rows)

postgres=# explain (analyze) execute my_stmt ( 'aaa' );
                                            QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
 Seq Scan on demo  (cost=0.00..5.75 rows=100 width=8) (actual time=0.029..0.122 rows=100 loops=1)
   Filter: (b = 'aaa'::text)
   Rows Removed by Filter: 200
 Planning Time: 0.206 ms
 Execution Time: 0.162 ms
(5 rows)

postgres=# explain (analyze) execute my_stmt ( 'aaa' );
                                            QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
 Seq Scan on demo  (cost=0.00..5.75 rows=100 width=8) (actual time=0.038..0.133 rows=100 loops=1)
   Filter: (b = 'aaa'::text)
   Rows Removed by Filter: 200
 Planning Time: 0.200 ms
 Execution Time: 0.244 ms
(5 rows)

postgres=# explain (analyze) execute my_stmt ( 'aaa' );
                                            QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
 Seq Scan on demo  (cost=0.00..5.75 rows=100 width=8) (actual time=0.032..0.114 rows=100 loops=1)
   Filter: (b = 'aaa'::text)
   Rows Removed by Filter: 200
 Planning Time: 0.169 ms
 Execution Time: 0.155 ms
(5 rows)

postgres=# explain (analyze) execute my_stmt ( 'aaa' );
                                            QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
 Seq Scan on demo  (cost=0.00..5.75 rows=100 width=8) (actual time=0.034..0.117 rows=100 loops=1)
   Filter: (b = 'aaa'::text)
   Rows Removed by Filter: 200
 Planning Time: 0.301 ms
 Execution Time: 0.162 ms
(5 rows)

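For completeness, here is what happens with “force_generic_plan”: the generic plan (note the “b = $1” filter) is used right away, with no need for five custom executions first (a quick sketch; the timings are illustrative):

postgres=# set plan_cache_mode = force_generic_plan;
SET
postgres=# explain (analyze) execute my_stmt ( 'aaa' );
                                            QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
 Seq Scan on demo  (cost=0.00..5.75 rows=100 width=8) (actual time=0.030..0.110 rows=100 loops=1)
   Filter: (b = $1)
   Rows Removed by Filter: 200
 Planning Time: 0.150 ms
 Execution Time: 0.160 ms
(5 rows)
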
A nice addition that can help when you have generic plans that actually perform worse than custom plans.

Cet article PostgreSQL 12: Control when generic plans are used est apparu en premier sur Blog dbi services.

Great Football Managers and Oracle Unified Auditing

The Anti-Kyte - Fri, 2019-05-31 11:48

It’s quite a good time for English football at the moment. Not only have English clubs monopolised the finals of the two main European Club competitions this year, but Manchester City have made history by winning all three domestic competitions in the same season.
Note that this isn’t a British footballing first. Glasgow Rangers managed it way back in 1949. And whilst the European Cup ( Champions League if you must) has eluded City this season, Celtic managed that particular clean sweep in 1967.
In English football however, this particular treble is unprecedented. In fact, there are remarkably few managers who have been able to win every one of the major domestic honours in their entire career.
All of which will come in handy when looking for examples to illustrate the topic at hand, namely Oracle Unified Auditing.
With the aid of 18c Express Edition, we’ll be looking at :

  • The Oracle supplied Unified Auditing Policies that are enabled by default
  • Where to find the Audit Trail
  • How to create our own Unified Auditing Policy to monitor DML operations on specific objects

Unified Auditing default settings

Unified Auditing was introduced in 12c. By default, it’s configured to work alongside auditing from earlier database versions.
We can check that this is the case – and that traditional auditing has not been disabled – by confirming that the Unified Auditing option is set to FALSE.
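A sketch of one way to do this is to query v$option :

select parameter, value
from v$option
where parameter = 'Unified Auditing';

PARAMETER            VALUE     
-------------------- ----------
Unified Auditing     FALSE     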

Looking at the audit initialization parameters is also useful at this point :

select name, value, description
from v$parameter
where name like '%audit%'
and value is not null
order by name
/

NAME                           VALUE                          DESCRIPTION                                       
------------------------------ ------------------------------ --------------------------------------------------
audit_file_dest                /opt/oracle/admin/XE/adump     Directory in which auditing files are to reside   
audit_sys_operations           TRUE                           enable sys auditing                               
audit_trail                    DB                             enable system auditing                            
unified_audit_sga_queue_size   1048576                        Size of Unified audit SGA Queue                   

We can see that we have a directory setup for audit files to be written to. However, the audit_trail is set to DB – i.e. a table in the database.
To start with, we can have a look at the audit_file_dest directory and see what, if anything, is being written there :

ls -lrt /opt/oracle/admin/XE/adump

...
-rw-r-----. 1 oracle oinstall     866 May 30 13:16 XE_ora_3880_20190530131639680705403060.aud
-rw-r-----. 1 oracle oinstall    1777 May 30 13:16 XE_ora_4375_20190530131639737578276396.aud
-rw-r-----. 1 oracle oinstall    1202 May 30 13:16 XE_ora_4381_20190530131643953498040745.aud

Closer inspection of these files reveal that they contain details of the operations performed when Oracle is started but before the database is open. For example :

cat XE_ora_3880_20190530131639680705403060.aud

Audit file /opt/oracle/admin/XE/adump/XE_ora_3880_20190530131639680705403060.aud
Oracle Database 18c Express Edition Release 18.0.0.0.0 - Production
Version 18.4.0.0.0
Build label:    RDBMS_18.4.0.0.0DBRU_LINUX.X64_181017
ORACLE_HOME:    /opt/oracle/product/18c/dbhomeXE
System name:    Linux
Node name:	frea.virtualbox
Release:        3.10.0-957.1.3.el7.x86_64
Version:        #1 SMP Thu Nov 29 14:49:43 UTC 2018
Machine:        x86_64
Instance name: XE
Redo thread mounted by this instance: 0 <none>
Oracle process number: 299
Unix process pid: 3880, image: oracle@frea.virtualbox (TNS V1-V3)

Thu May 30 13:16:39 2019 +01:00
LENGTH : '247'
ACTION :[7] 'STARTUP'
DATABASE USER:[1] '/'
PRIVILEGE :[6] 'SYSDBA'
CLIENT USER:[6] 'oracle'
CLIENT TERMINAL:[7] 'UNKNOWN'
STATUS:[1] '0'
DBID:[0] ''
SESSIONID:[1] '0'
USERHOST:[15] 'frea.virtualbox'
CLIENT ADDRESS:[0] ''
ACTION NUMBER:[3] '138'

Well that could come in useful, but is there any auditing of operations once the database is up and running?

Oracle supplied Unified Audit Policies

There are a number of policies that are set up by default :

select owner, object_name
from dba_objects
where object_type = 'UNIFIED AUDIT POLICY'
and oracle_maintained = 'Y'
order by object_name;

OWNER      OBJECT_NAME                                       
---------- --------------------------------------------------
SYS        ORA_ACCOUNT_MGMT                                  
SYS        ORA_CIS_RECOMMENDATIONS                           
SYS        ORA_DATABASE_PARAMETER                            
SYS        ORA_DV_AUDPOL                                     
SYS        ORA_DV_AUDPOL2                                    
SYS        ORA_LOGON_FAILURES                                
SYS        ORA_RAS_POLICY_MGMT                               
SYS        ORA_RAS_SESSION_MGMT                              
SYS        ORA_SECURECONFIG                                  

9 rows selected. 

We can confirm which of these are active by running :

select policy_name, user_name, 
    enabled_option, entity_name, entity_type, 
    success, failure
from audit_unified_enabled_policies
order by policy_name
/

POLICY_NAME          ENABLED_OPTION       ENTITY_NAME  ENTITY_TYPE  SUCCESS    FAILURE   
-------------------- -------------------- ------------ ------------ ---------- ----------
ORA_LOGON_FAILURES   ALL USERS            ALL USERS    USER         NO         YES       
ORA_SECURECONFIG     ALL USERS            ALL USERS    USER         YES        YES       

One point to note is that the ORA_LOGON_FAILURES policy is configured to record statements on failure and not on success.

ORA_LOGON_FAILURES

We can see which activities are covered by this policy by running :

select audit_option, audit_option_type 
from audit_unified_policies
where policy_name = 'ORA_LOGON_FAILURES';

AUDIT_OPTION         AUDIT_OPTION_TYPE 
-------------------- ------------------
LOGON                STANDARD ACTION   

From this, we can infer that this policy will record any failed logon attempts in the audit trail. Let’s test that hypothesis…

First off, I’m going to issue a failed login attempt :

sqlplus hr/thisisnotthepassword@xepdb1

SQL*Plus: Release 12.2.0.1.0 Production on Thu May 30 14:10:13 2019

Copyright (c) 1982, 2016, Oracle.  All rights reserved.

ERROR:
ORA-01017: invalid username/password; logon denied


Enter user-name: 

Right, that should do it. Now to check the audit trail in the database :

select event_timestamp, audit_type, dbusername, action_name, return_code
from unified_audit_trail uat
where unified_audit_policies = 'ORA_LOGON_FAILURES'
and uat.event_timestamp > systimestamp - ((1/24/60) *5) -- happened in the last 5 minutes
order by event_timestamp desc
/
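The output will look something like this (illustrative output; your timestamp will differ) :

EVENT_TIMESTAMP                AUDIT_TYPE DBUSERNAME ACTION_NAME RETURN_CODE
------------------------------ ---------- ---------- ----------- -----------
30-MAY-19 14.10.21.386917000   Standard   HR         LOGON              1017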

Note here that the record includes the return code, which looks to be the Oracle Error that was generated.
Also note that, in a multitenant environment, you need to be in the relevant container when you query the audit trail.
In this example, I was attempting to login to the XEPDB1 PDB. I would not see this logon failure if I was querying the audit trail from the CDB.

The UNIFIED_AUDIT_TRAIL contains 99 columns and the relevance of each of them will vary according to what you’re looking at in the audit trail.
As usual, the columns themselves have been documented with comments which you can find by running :

select column_name, comments
from dba_col_comments
where owner = 'AUDSYS'
and table_name = 'UNIFIED_AUDIT_TRAIL'
order by column_name
/

COLUMN_NAME                    COMMENTS                                          
------------------------------ --------------------------------------------------
ACTION_NAME                    Name of the action executed by the user           
ADDITIONAL_INFO                Text comment on the audit trail entry             
APPLICATION_CONTEXTS           SemiColon seperate list of Application Context Nam
                               espace, Attribute, Value information in (APPCTX_NS
                               PACE,APPCTX_ATTRIBUTE=<value>) format             

AUDIT_OPTION                   Auditing option set with the audit statement      
AUDIT_TYPE                     Type of the Audit Record                          
AUTHENTICATION_TYPE            Type of Authentication for the session user       
CLIENT_IDENTIFIER              Client identifier in each Oracle session        
***snip***
XS_SESSIONID                   Real Application User Session Identifier          
XS_TARGET_PRINCIPAL_NAME       Target principal name in Real Application Security
                                operations                                       

XS_USER_NAME                   Real Application User name                        
ORA_SECURECONFIG

By contrast, ORA_SECURECONFIG audits a number of system privileges :

select audit_option, audit_option_type, object_schema
from audit_unified_policies
where policy_name = 'ORA_SECURECONFIG'
order by 1
/

AUDIT_OPTION                             AUDIT_OPTION_TYPE  OBJECT_SCHEMA                 
---------------------------------------- ------------------ ------------------------------
ADMINISTER KEY MANAGEMENT                SYSTEM PRIVILEGE   NONE                          
ALTER ANY PROCEDURE                      SYSTEM PRIVILEGE   NONE                          
ALTER ANY SQL TRANSLATION PROFILE        SYSTEM PRIVILEGE   NONE                          
ALTER ANY TABLE                          SYSTEM PRIVILEGE   NONE                          
ALTER DATABASE                           SYSTEM PRIVILEGE   NONE                          
ALTER DATABASE DICTIONARY                STANDARD ACTION    NONE                          
ALTER DATABASE LINK                      STANDARD ACTION    NONE                          
ALTER PLUGGABLE DATABASE                 STANDARD ACTION    NONE                          
ALTER PROFILE                            STANDARD ACTION    NONE                          
ALTER ROLE                               STANDARD ACTION    NONE                          
ALTER SYSTEM                             SYSTEM PRIVILEGE   NONE                          
ALTER USER                               STANDARD ACTION    NONE                          
AUDIT SYSTEM                             SYSTEM PRIVILEGE   NONE                          
BECOME USER                              SYSTEM PRIVILEGE   NONE                          
CREATE ANY JOB                           SYSTEM PRIVILEGE   NONE                          
CREATE ANY LIBRARY                       SYSTEM PRIVILEGE   NONE                          
CREATE ANY PROCEDURE                     SYSTEM PRIVILEGE   NONE                          
CREATE ANY SQL TRANSLATION PROFILE       SYSTEM PRIVILEGE   NONE                          
CREATE ANY TABLE                         SYSTEM PRIVILEGE   NONE                          
CREATE DATABASE LINK                     STANDARD ACTION    NONE                          
CREATE DIRECTORY                         STANDARD ACTION    NONE                          
CREATE EXTERNAL JOB                      SYSTEM PRIVILEGE   NONE                          
CREATE PLUGGABLE DATABASE                STANDARD ACTION    NONE                          
CREATE PROFILE                           STANDARD ACTION    NONE                          
CREATE PUBLIC SYNONYM                    SYSTEM PRIVILEGE   NONE                          
CREATE ROLE                              STANDARD ACTION    NONE                          
CREATE SQL TRANSLATION PROFILE           SYSTEM PRIVILEGE   NONE                          
CREATE USER                              SYSTEM PRIVILEGE   NONE                          
DROP ANY PROCEDURE                       SYSTEM PRIVILEGE   NONE                          
DROP ANY SQL TRANSLATION PROFILE         SYSTEM PRIVILEGE   NONE                          
DROP ANY TABLE                           SYSTEM PRIVILEGE   NONE                          
DROP DATABASE LINK                       STANDARD ACTION    NONE                          
DROP DIRECTORY                           STANDARD ACTION    NONE                          
DROP PLUGGABLE DATABASE                  STANDARD ACTION    NONE                          
DROP PROFILE                             STANDARD ACTION    NONE                          
DROP PUBLIC SYNONYM                      SYSTEM PRIVILEGE   NONE                          
DROP ROLE                                STANDARD ACTION    NONE                          
DROP USER                                SYSTEM PRIVILEGE   NONE                          
EXECUTE                                  OBJECT ACTION      SYS                           
EXECUTE                                  OBJECT ACTION      REMOTE_SCHEDULER_AGENT        
EXEMPT ACCESS POLICY                     SYSTEM PRIVILEGE   NONE                          
EXEMPT REDACTION POLICY                  SYSTEM PRIVILEGE   NONE                          
GRANT ANY OBJECT PRIVILEGE               SYSTEM PRIVILEGE   NONE                          
GRANT ANY PRIVILEGE                      SYSTEM PRIVILEGE   NONE                          
GRANT ANY ROLE                           SYSTEM PRIVILEGE   NONE                          
LOGMINING                                SYSTEM PRIVILEGE   NONE                          
PURGE DBA_RECYCLEBIN                     SYSTEM PRIVILEGE   NONE                          
SET ROLE                                 STANDARD ACTION    NONE                          
TRANSLATE ANY SQL                        SYSTEM PRIVILEGE   NONE   

It so happens that I’ve installed the HR demo application on this database by following Oracle’s instructions and running $ORACLE_HOME/demo/schema/human_resources/hr_main.sql as SYSTEM.

This policy has captured the DDL from this activity :

select scn, dbusername, system_privilege_used, action_name, object_schema, object_name
from unified_audit_trail
where unified_audit_policies = 'ORA_SECURECONFIG'
and scn >= 2038863
order by event_timestamp
/

       SCN DBUSERNAME   SYSTEM_PRIVILEGE_USED                    ACTION_NAME          OBJECT_SCHEM OBJECT_NAME         
---------- ------------ ---------------------------------------- -------------------- ------------ --------------------
   2038863 SYSTEM       CREATE USER                              CREATE USER                       HR                  
   2038869 SYSTEM       ALTER USER                               ALTER USER                        HR                  
   2038874 SYSTEM       ALTER USER                               ALTER USER                        HR                  
   2038880 SYSTEM       GRANT ANY PRIVILEGE                      GRANT                                                 
   2038886 SYSTEM       GRANT ANY ROLE, GRANT ANY PRIVILEGE      GRANT                                                 
   2038897 SYSTEM       CREATE ANY TABLE                         CREATE TABLE         HR           REGIONS             
   2038910 SYSTEM       ALTER ANY TABLE                          ALTER TABLE          HR           REGIONS             
   2038923 SYSTEM       CREATE ANY TABLE                         CREATE TABLE         HR           COUNTRIES           
   2038929 SYSTEM       ALTER ANY TABLE                          ALTER TABLE          HR           COUNTRIES           
   2038937 SYSTEM       CREATE ANY TABLE                         CREATE TABLE         HR           LOCATIONS           
   2038949 SYSTEM       ALTER ANY TABLE                          ALTER TABLE          HR           LOCATIONS           
   2038962 SYSTEM       CREATE ANY TABLE                         CREATE TABLE         HR           DEPARTMENTS         
   2038976 SYSTEM       ALTER ANY TABLE                          ALTER TABLE          HR           DEPARTMENTS         
   2038988 SYSTEM       CREATE ANY TABLE                         CREATE TABLE         HR           JOBS                
   2039000 SYSTEM       ALTER ANY TABLE                          ALTER TABLE          HR           JOBS                
   2039016 SYSTEM       CREATE ANY TABLE                         CREATE TABLE         HR           EMPLOYEES           
   2039030 SYSTEM       ALTER ANY TABLE                          ALTER TABLE          HR           EMPLOYEES           
   2039036 SYSTEM       ALTER ANY TABLE                          ALTER TABLE          HR           DEPARTMENTS         
   2039052 SYSTEM       CREATE ANY TABLE                         CREATE TABLE         HR           JOB_HISTORY         
   2039068 SYSTEM       ALTER ANY TABLE                          ALTER TABLE          HR           JOB_HISTORY         
   2040134 SYSTEM       ALTER ANY TABLE                          ALTER TABLE          HR           DEPARTMENTS         
   2040179 SYSTEM       ALTER ANY TABLE                          ALTER TABLE          HR           DEPARTMENTS         
   2040303 SYSTEM       CREATE ANY PROCEDURE                     CREATE PROCEDURE     HR           SECURE_DML          
   2040318 SYSTEM       CREATE ANY PROCEDURE                     CREATE PROCEDURE     HR           ADD_JOB_HISTORY     
   2041099 SYSTEM       ALTER USER                               ALTER USER                        HR                  

25 rows selected. 

NOTE that the audit trail contains the System Change Number (SCN) as well as a timestamp, which can make it easier to select the part of the audit trail you are interested in at any given time.

As well as the details of who did what, the audit trail holds the SQL that was executed as part of this operation. For example, if we wanted to check what statement was run when the HR.REGIONS table was created, we could run :

select sql_text
from unified_audit_trail
where scn = 2038897;

SQL_TEXT                                                                        
--------------------------------------------------------------------------------
CREATE TABLE regions                                                            
    ( region_id      NUMBER                                                     
       CONSTRAINT  region_id_nn NOT NULL                                        
    , region_name    VARCHAR2(25)                                               
    )                            

Note that an SCN relates to a transaction rather than to an individual statement. As we’re looking for a DDL statement, which is almost always contained in its own transaction (except when it isn’t), this works here, but it’s probably not such a good idea to rely on the SCN as a de facto Primary Key when you’re searching for DML statements, which may share a transaction (and therefore an SCN) with several others. Obviously, an SCN is completely useless when you’re searching for SELECT statements.
Whilst we’re on the subject of predicates when selecting from the audit trail, it’s worth bearing in mind that the underlying table is actually partitioned by EVENT_TIMESTAMP. By default, these partitions cover one month each. Including EVENT_TIMESTAMP in the predicate will cause the optimizer to perform partition pruning, which may help query performance.

Anyway, speaking of auditing DML, it’s time to introduce…

The Demo Application

We have an application to hold details of each manager who has won each of England’s major domestic trophies. The application is owned by MOTTY, who is a bit of a footie geek. We also have an application user called TEDDY who hasn’t been around long and still thinks that football didn’t exist before the Premier League…

create user motty identified by motty
/
alter user motty quota unlimited on users
/

grant connect, create table, create sequence, create procedure to motty
/

create user teddy identified by teddy
/

grant connect to teddy
/

The application objects are the WINNING_MANAGERS table…

create table winning_managers(
    wm_name varchar2(100) constraint wm_pk primary key,
    league_title number(4),
    fa_cup number(4),
    league_cup number(4))
/

comment on table winning_managers is 'Managers who have won a career treble and the years in which they first won each major English domestic trophy'
/

comment on column winning_managers.wm_name is 'The name of the manager'
/

comment on column winning_managers.league_title is 'The year the manager first won the Title'
/

comment on column winning_managers.fa_cup is 'The year the manager first won the FA Cup'
/

comment on column winning_managers.league_cup is 
    'The year the manager first won the League (Coca-Cola/Worthington/Carling/Capital One/EFL/Carabao) Cup'
/

…and a package…

create or replace package managers_api as
    procedure ins( 
        i_name winning_managers.wm_name%type, 
        i_title winning_managers.league_title%type, 
        i_fac winning_managers.fa_cup%type, 
        i_lc winning_managers.league_cup%type);
        
    procedure upd(
        i_name winning_managers.wm_name%type,
        i_title winning_managers.league_title%type, 
        i_fac winning_managers.fa_cup%type, 
        i_lc winning_managers.league_cup%type);
        
    procedure del( i_name winning_managers.wm_name%type);
end managers_api;
/

create or replace package body managers_api is

    procedure ins( 
        i_name winning_managers.wm_name%type, 
        i_title winning_managers.league_title%type, 
        i_fac winning_managers.fa_cup%type, 
        i_lc winning_managers.league_cup%type)
    is
    begin
        insert into winning_managers( wm_name, league_title, fa_cup, league_cup)
        values( i_name, i_title, i_fac, i_lc);
    end ins;
        
    procedure upd(
        i_name winning_managers.wm_name%type,
        i_title winning_managers.league_title%type, 
        i_fac winning_managers.fa_cup%type, 
        i_lc winning_managers.league_cup%type)
    is
    begin
        if i_name is null then
            raise_application_error(-20000, 'Manager name must be specified');
        end if;
        
        update winning_managers
        set league_title = nvl(i_title, league_title),
            fa_cup = nvl( i_fac, fa_cup),
            league_cup = nvl(i_lc, league_cup)
        where wm_name = i_name;
    end upd;
        
    procedure del(i_name winning_managers.wm_name%type)
    is
    begin
        if i_name is null then
            raise_application_error(-20000, 'Manager name must be specified');
        end if;
        delete from winning_managers
        where wm_name = i_name;
    end del;
end managers_api;
/

For reasons which will shortly become apparent, Teddy has been granted access to the application as follows :

grant select, insert, update, delete on winning_managers to teddy
/

grant execute on managers_api to teddy
/

Teddy is working on a project to identify all managers who have won each of the major English domestic trophies in their career.
He immediately takes advantage of these new privileges to add all of the Premier League winning managers to the application…

begin
    motty.managers_api.ins('SIR ALEX FERGUSON', 1993, 1994, 2006);
    motty.managers_api.ins('KENNY DALGLISH', 1995, null, 2012);
    motty.managers_api.ins('ARSENE WENGER', 1998, 1998, null);
    motty.managers_api.ins('JOSE MOURINHO', 2005, 2007, 2006);
    motty.managers_api.ins('CARLO ANCELOTTI', 2010, 2010, null);
    motty.managers_api.ins('ROBERTO MANCINI', 2012, 2011, null);
    motty.managers_api.ins('MANUEL PELLIGRINI', 2014, null, 2014);
    motty.managers_api.ins('CLAUDIO RANIERI', 2016, null, null);
    motty.managers_api.ins('ANTONIO CONTI', 2017, 2018, null);
    motty.managers_api.ins('PEP GUARDIOLA', 2018, 2019, 2018);

    commit;
end;
/

When Motty checks the table, he can see that there are a few problems :

select wm_name, league_title, fa_cup, league_cup
from winning_managers
/

WM_NAME              LEAGUE_TITLE     FA_CUP LEAGUE_CUP
-------------------- ------------ ---------- ----------
SIR ALEX FERGUSON            1993       1994       2006
KENNY DALGLISH               1995                  2012
ARSENE WENGER                1998       1998           
JOSE MOURINHO                2005       2007       2006
CARLO ANCELOTTI              2010       2010           
ROBERTO MANCINI              2012       2011           
MANUEL PELLIGRINI            2014                  2014
CLAUDIO RANIERI              2016                      
ANTONIO CONTI                2017       2018           
PEP GUARDIOLA                2018       2019       2018

10 rows selected. 

It looks like someone has not accounted for anything prior to the 1992/93 season.
Motty needs to find out who is doing this so that he can explain that the FA Cup has been going since 1872, the League since 1888, and the League Cup since 1961.

Auditing DML

In order to create a policy, you need to have the AUDIT SYSTEM privilege or the AUDIT_ADMIN role.
We’ll connect to the PDB as system and grant the role to MOTTY now :

grant audit_admin to motty;

Now, connected as MOTTY, we can create an audit policy for the table :

create audit policy wm_audit
    actions all on motty.winning_managers
    when 'sys_context(''userenv'', ''session_user'') not in (''SYS'', ''SYSTEM'')' evaluate per session
/

Audit created

…and enable it…

audit policy wm_audit
/

Audit succeeded.

Note that the when condition of the policy needs to be formatted as in this example. If you use the standard SQL method for escaping quotes – i.e. :

when q'['sys_context('userenv', 'session_user') not in ('SYS', 'SYSTEM')']' evaluate per session

you’ll find yourself on the wrong end of :

ORA-46368: Audit policy does not have a simple rule condition.

In terms of specifying which users should be included or excluded from this policy, you could re-create it as follows :

noaudit policy wm_audit
/

drop audit policy wm_audit
/

create audit policy wm_audit
    actions all on motty.winning_managers
/

audit policy wm_audit except sys, system	
/

Either way, with the policy in place and enabled, the next time Teddy connects and checks the table…

select wm_name 
from motty.winning_managers
/

…MOTTY gets to see an entry in the audit log :


select dbusername, sql_text
from unified_audit_trail
where unified_audit_policies = 'WM_AUDIT'
/

DBUSERNAME           SQL_TEXT                                          
-------------------- --------------------------------------------------
TEDDY                select wm_name                                    
                     from motty.winning_managers                       

Looking at the policy, we can confirm that any activities by SYS and SYSTEM on this table are excluded from any logging under this policy :


select audit_condition
from audit_unified_policies
where policy_name = 'WM_AUDIT'
/

AUDIT_CONDITION                                                       
----------------------------------------------------------------------
sys_context('userenv', 'session_user') not in ('SYS', 'SYSTEM')

So when SYSTEM runs a query against the table :

select wm_name
from motty.winning_managers
where league_title is not null
and fa_cup is not null
and league_cup is not null
/

WM_NAME                                 
----------------------------------------
SIR ALEX FERGUSON
JOSE MOURINHO
PEP GUARDIOLA

It does not show up in the Audit Trail :

select event_timestamp, sql_text
from unified_audit_trail
where unified_audit_policies = 'WM_AUDIT'
and dbusername = 'SYSTEM'
/

no rows selected

At this point, you may be wondering why you don’t simply dispense with any journalling triggers you have hanging around in your application and just use the audit trail instead. Well, to illustrate one major difference in how these mechanisms might behave consider the following…

Teddy has overheard Motty grumbling in the office and decides to insert another record into the table …

insert into motty.winning_managers(wm_name, league_title, fa_cup, league_cup)
values('MOTTY', 1888, 1872, 1961);

…before having second thoughts…

rollback;

and checking that the record has indeed not been committed :

select league_title, fa_cup, league_cup
from motty.winning_managers
where wm_name = 'MOTTY'
/

no rows selected

Unfortunately, when Motty checks the audit trail again, Teddy is busted…

select dbusername, action_name, sql_text
from unified_audit_trail
where unified_audit_policies = 'WM_AUDIT'
and event_timestamp > systimestamp - ((1/24/60) * 10) 
order by event_timestamp
/

DBUSERNAME           ACTION_NAME          SQL_TEXT                                
-------------------- -------------------- ----------------------------------------
TEDDY                INSERT               insert into motty.winning_managers(wm_na
                                          me, league_title, fa_cup, league_cup)   
                                          values('MOTTY', 1888, 1872, 1961)       
                                                                                 

TEDDY                SELECT               select league_title, fa_cup, league_cup 
                                          from motty.winning_managers             
                                          where wm_name = 'MOTTY'                 

So, even though the INSERT statement was not committed, it still appears in the audit trail. Probably not the sort of thing you want from a journalling trigger.

DML from stored procedure calls

Teddy has now started to clean up the data using the MANAGERS_API package :

begin
    -- Correct the dates for Sir Alex Ferguson's first FA Cup and League Cup wins as a manager
    motty.managers_api.upd(i_name => 'SIR ALEX FERGUSON', i_title => null, i_fac => 1990, i_lc => 1992);
    
    -- Delete Kenny Dalglish record...
    motty.managers_api.del(i_name => 'KENNY DALGLISH');
    
    --Re-insert the record to reflect the fact that he has been knighted
    motty.managers_api.ins(i_name => 'SIR KENNY DALGLISH', i_title => 1986, i_fac => 1986, i_lc => 2012);
end;
/

commit;

We can see each of the DML statements resulting from these stored program unit calls. However, whilst the SQL_TEXT shows the bind variable placeholders, the bind variable values themselves can be found in the SQL_BINDS column :

select dbusername, action_name, sql_text, sql_binds
from unified_audit_trail
where unified_audit_policies = 'WM_AUDIT'
and event_timestamp > systimestamp - ((1/24/60) * 10)
order by event_timestamp
/

DBUSERNAME      ACTION_NAME     SQL_TEXT                                                     SQL_BINDS                     
--------------- --------------- ------------------------------------------------------------ ------------------------------
TEDDY           UPDATE          UPDATE WINNING_MANAGERS SET LEAGUE_TITLE = NVL(:B4 , LEAGUE_  #1(0):  #2(4):1990 #3(4):1992
                                TITLE), FA_CUP = NVL( :B3 , FA_CUP), LEAGUE_CUP = NVL(:B2 ,   #4(17):SIR ALEX FERGUSON     
                                LEAGUE_CUP) WHERE WM_NAME = :B1                                                           

TEDDY           DELETE          DELETE FROM WINNING_MANAGERS WHERE WM_NAME = :B1             #1(14):KENNY DALGLISH        
TEDDY           INSERT          INSERT INTO WINNING_MANAGERS( WM_NAME, LEAGUE_TITLE, FA_CUP,  #1(18):SIR KENNY DALGLISH #2(
                                 LEAGUE_CUP) VALUES( :B4 , :B3 , :B2 , :B1 )                4):1986 #3(4):1986 #4(4):2012 

Realising that not completing the list of managers will annoy anyone who has read this far in the hope of finding out, Teddy knuckles down and finishes the job. Once he’s done, we can see the full list :

select wm_name, league_title, fa_cup, league_cup, 
    greatest( league_title, fa_cup, league_cup) as "Career Treble"
from motty.winning_managers
order by 5
/

MANAGER                        LEAGUE_TITLE       FA_CUP   LEAGUE_CUP Career Treble
------------------------------ ------------ ------------ ------------ -------------
Joe Mercer                             1968         1969         1961          1969
Bill Nicholson                         1961         1961         1971          1971
Don Revie                              1969         1972         1968          1972
Sir Alex Ferguson                      1993         1990         1992          1993
George Graham                          1989         1993         1987          1993
Jose Mourinho                          2005         2007         2006          2007
Sir Kenny Dalglish                     1986         1986         2012          2012
Pep Guardiola                          2018         2019         2018          2019

8 rows selected. 

Useful Links and Acknowledgements

As ever, the Oracle documentation is a mine of useful information.

The page for the AUDIT command is a useful starting point, as is the page for CREATE AUDIT POLICY.

As usual, Tim Hall’s Oracle Base has a number of useful articles on this topic, this one being of particular use to me in putting this post together.

Monitoring & Troubleshooting Oracle Cloud at Customer

Syed Jaffar - Fri, 2019-05-31 08:33
The prime advantage of cloud at customer is that it delivers all the cloud benefits in your own data center, and Oracle Cloud at Customer provides exactly that. When Oracle Cloud at Customer is chosen, Oracle is responsible for installing, configuring and managing the software and hardware required to run it. However, customers are responsible for monitoring and troubleshooting the resources instantiated on Oracle Cloud at Customer.

Customers are required to understand the difference between system and user-space monitoring and the tools involved. The Oracle Cloud at Customer subscription consists of the following components:


  • Hardware and Software
  • Control panel software
  • The Oracle Advanced Support Gateway (OASW)
  • The Oracle Cloud Service

System monitoring vs User Space Monitoring

Typically, Oracle Cloud at Customer is monitored at two levels:
  1. System
  2. User space
Oracle monitors the system and the customer monitors the user space.


The system or machine resources (hardware, control panel software and cloud services) on Oracle Cloud at Customer are managed remotely by Oracle using the Oracle Advanced Support Gateway. The gateway is only used by, and accessible to, Oracle authorized personnel.

The user space components consist of the following:


  • Oracle Cloud accounts
  • VMs instances on IaaS or PaaS
  • Databases that are provisioned within the PaaS subscription
  • Applications (Oracle or any third-party)
Oracle manages the following hardware and software components:
  • Ethernet switches
  • Power Supplies
  • Exadata Storage Servers
  • Hypervisor running on the physical servers
Customers can assign administrators to manage their cloud accounts, and they are also free to use external monitoring agents to monitor user-space components.


Partner Webcast – Oracle APEX on Cloud: architecture, functionality and features

Oracle Application Express (APEX) is a low-code development platform that enables you to build scalable, secure enterprise apps, with world-class features, that can be deployed anywhere. With...

We share our skills to maximize your revenue!
Categories: DBA Blogs

Securely store passwords in PostgreSQL

Yann Neuhaus - Fri, 2019-05-31 00:25

Every application somehow needs to deal with passwords. Some use external authentication methods such as LDAP, others use the framework the database already provides and create users and roles. But it is also not uncommon that applications implement their own concept for managing users. If an application does this, it should be done the right way and passwords should never be stored in plain text in the database. PostgreSQL comes with a handy extension that supports you with that.

You might already be aware that PostgreSQL comes with a lot of additional modules by default. One of these modules is pgcrypto and it can be used for the use case described above: hashing and encrypting strings so you do not have to implement that on your own. Let’s start with a simple table which contains usernames and their passwords:

postgres=# create table app_users ( id int generated always as identity ( cache 10 ) primary key
postgres(#                        , username text not null unique
postgres(#                        , password text not null
postgres(#                        );
CREATE TABLE
postgres=# \d app_users
                         Table "public.app_users"
  Column  |  Type   | Collation | Nullable |           Default            
----------+---------+-----------+----------+------------------------------
 id       | integer |           | not null | generated always as identity
 username | text    |           | not null | 
 password | text    |           | not null | 
Indexes:
    "app_users_pkey" PRIMARY KEY, btree (id)
    "app_users_username_key" UNIQUE CONSTRAINT, btree (username)
postgres=# 

Both the username and the password columns are implemented as plain text. If you keep it like that and just insert data, the password will of course be stored as plain text. So how can we use pgcrypto to improve that? Obviously the first step is to install the extension:

postgres=# create extension pgcrypto;
CREATE EXTENSION
postgres=# \dx
                   List of installed extensions
    Name    | Version |   Schema   |         Description          
------------+---------+------------+------------------------------
 pg_prewarm | 1.2     | public     | prewarm relation data
 pgcrypto   | 1.3     | public     | cryptographic functions
 plpgsql    | 1.0     | pg_catalog | PL/pgSQL procedural language
(3 rows)

Btw: There is a catalog view which you can use to list all available extensions:

postgres=# \d pg_available_extensions;
         View "pg_catalog.pg_available_extensions"
      Column       | Type | Collation | Nullable | Default 
-------------------+------+-----------+----------+---------
 name              | name |           |          | 
 default_version   | text |           |          | 
 installed_version | text | C         |          | 
 comment           | text |           |          | 

postgres=# select * from pg_available_extensions limit 3;
  name   | default_version | installed_version |                comment                 
---------+-----------------+-------------------+----------------------------------------
 plpgsql | 1.0             | 1.0               | PL/pgSQL procedural language
 plperl  | 1.0             |                   | PL/Perl procedural language
 plperlu | 1.0             |                   | PL/PerlU untrusted procedural language
(3 rows)

The function to use (provided by the pgcrypto module) for storing passwords is crypt(). Strictly speaking, crypt() does not encrypt the string but calculates a salted one-way hash of it, which is exactly what you want for passwords. This function takes two arguments:

  • The actual string to hash
  • The salt (a random value) to use for hashing

Adding a user with a hashed password is as easy as:

postgres=# insert into app_users (username, password) 
postgres-#        values ( 'myuser', crypt('mypassword', gen_salt('bf')) );
INSERT 0 1

In this case we used the Blowfish algorithm to generate the salt. You can also use md5, xdes and des.
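gen_salt() also accepts an optional iteration count as second argument for the bf and xdes algorithms, so you can make the hash computation more expensive (a quick sketch; 10 is just an example value):

postgres=# insert into app_users (username, password) 
postgres-#        values ( 'myuser2', crypt('mypassword', gen_salt('bf', 10)) );
INSERT 0 1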

When we look at the password for our user we will see that it is not plain text anymore:

postgres=# select password from app_users where username = 'myuser';
                           password                           
--------------------------------------------------------------
 $2a$06$8wu4VWVubv/RBYBSuj.1TOojPm0q7FkRwuDSoW0OTOC6FzBGEslIC
(1 row)

This is it for the hashing part. To compare the stored hash against the plain text version of the string, we use the crypt() function again, this time passing the stored hash as the salt:

postgres=# select (password = crypt('mypassword', password)) AS pwd_match 
postgres-#   from app_users
postgres-#  where username = 'myuser';
 pwd_match 
-----------
 t
(1 row)

Providing the wrong password of course returns false:

postgres=# select (password = crypt('Xmypassword', password)) AS pwd_match 
  from app_users
 where username = 'myuser';
 pwd_match 
-----------
 f
(1 row)
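If you need this check in more than one place you could wrap it into a small function. A minimal sketch (the function name and the fallback to false for unknown users are my own choices, not part of the module):

create or replace function check_login ( p_username text, p_password text )
returns boolean
language sql
stable
as $$
  -- compare the supplied password against the stored hash,
  -- returning false when the user does not exist at all
  select coalesce( ( select password = crypt(p_password, password)
                       from app_users
                      where username = p_username )
                 , false );
$$;

postgres=# select check_login('myuser', 'mypassword');
 check_login 
-------------
 t
(1 row)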

So, finally, there is not much you need to do to store passwords securely in PostgreSQL. Just use it.

Cet article Securely store passwords in PostgreSQL est apparu en premier sur Blog dbi services.
