Databases is the collection of logically related databases connected through
When we design a
distributed database system, We mainly used two strategies. Fragmentation and
In fragmentation, We
fragment a single or multiple database tables into different pieces and place
these different pieces on different sites.
Advantage is to
increased local access es and efficiency.
But the problems,
which can occur in fragmentation is to select the right fragment size, the
placement of fragment that where a segment should be placed. When a global user
want to get consolidated data from different sites then complex query
processing increase the complexity of this technique.
In Replication, We
replicate a single or multiple database tables on different sites.
Its chief advantage
is to reduce remote access. A user can go to any site of the company and get
his/her data without remotely connected to his/her nearby living site.
disadvantage is space. As a record is duplicating at different sites so no
doubt there will be memory utilization and by reducing remote accesses we
degrade our services for local users. For example Site A has its concerned data
with size 2 GB, Site B has the data of 1 GB, Site 3 has the data of 4 GB. Then
total data will stored at Site A will have 7 GB. This will happen only when we
replicate all databases at different sites. We can replicate some tables but
here I am showing that how much memory will use at Site A when we replicate all
our databases at different sites.
In this paper, I
proposed a new strategy by getting rid of both Fragmentation and Replication. I
call this strategy a Direct Method.
applicable funding agency here. If
none, delete this text box.
Consider, We have a
business to supply milk into the homes in different cities. Let’s say,
Currently this business is working in Sargodha, Lahore, Rawaplindi and Karachi
and head office is in Islamabad. So, we want to design a distributed database
so that in every city it manage its own customers, total supply, total income
etc. And from head office, we may want to see the consolidated data from
Because the data
which we are storing at different sites is same in nature so what we do, we
design one database by using all techniques which we see in designing
Centralized database i.e (Conceptual design, Logical design, Physical design)
and when we made such database then we simply copy this empty database at
different sites by using some naming conventions.
Naming Conventions in Direct Method
Let’s say the
database which we designed is MILK and its tables are Customers, Sold_Milk,
etc. So for any Branch we put Branch Name before database name, before table
names, and before the attribute names.
Let’s say for
Attributes in Tables:-
Sargodha_Milk_Quantity, Sargodha_Period ….etc
Database Name:- Rawalpindi_MILK
Attributes in Tables:-
Rawalpindi_Milk_Quantity, Rawalpindi_Period ….etc
And same implies for
all other branches.
Second thing which we use in this strategy is
to store a lookup table at every site which states all other branches.
At Sargodha Branch
Similarly, at Rawalpindi Branch
applies for all other branches.
where are we now? We have designed a database. We copied the database by using
Naming conventions which mentioned above. And then we placed respective
databases at sites. And also we created lookup tables at each site, stating all
other sites. Now, when a customer comes for registration in Sargodha, the query
used will be
Sargodha_Customer (Sargodha_Customer_ID, Sargodha_Customer_Name,
Values (10001,”Ahsan Khan”, “Salanwali”,03001234567)
Only Branch Name will be written before the Table and attributes name and query
will be exactly applied at database because the database which lied at Sargodha
branch is the same. All other local queries at this site and all other sites
will be executed by this manner.
If we continue with the
same example which is given in the section of Local Access, Then if someone
wants to register in Sargodha for Rawaplindi branch. Then a module which we
called as an agent, extract the branch name from lookup table and do the same
thing i.e (Placed Branch name before the table and attributes names) and make a
Rawalpindi_Customer (Rawalpindi _Customer_ID, Rawalpindi _Customer_Name,
Rawalpindi _Address, Rawalpindi
_Mobile_No) Values (20001,”Ishaq Khan”, “Pindi”,03002345678)
And execute the query
remotely. Any kind of query can be executed remotely by just using simple
Now, Let’s talk about how
queries will be processed via global access. Consider, we want to see the total
quantity o milk sold at different sites in January, 2017. The query which
originate from head office is
Where Sargodha_Period=”January, 2017″
From Rawalpindi _Sold_Milk
Where Rawalpindi _Period=”January, 2017″
This query will be made for all sites like this
by looking up from look up table and processed at all the sites.
When these queries execute at different site
and we received the data like
Upto so on like these from Lahore, Karachi
We present these as a consolidated data by
adding all the quantities. But what if data received in multiple sites is not
in this domain. For example data received from sites like
As we know the data belongs to different
domains so in this case we will not add the quantities. We take union of tables
and present that data to global user. I talk more about this issue in
Use of Programming
The Direct Method contains the use of lots of
programming. We use some programming in strategy by getting rid from
Fragmentation and Replication.
In fact, The Programming used here is not very
complex. We just use simple algorithms to interpret the string. I illustrate it
by using simple example.
The input to algorithm is the query
And Lookup table will
For i=1 to count(table)
For j=1 to j