SQL query return data from multiple tables

I would like to know the following:

  • how to get data from multiple tables in my database?
  • what types of methods are there to do this?
  • what are joins and unions and how are they different from one another?
  • When should I use each one compared to the others?

I am planning to use this in my (for example - PHP) application, but don't want to run multiple queries against the database, what options do I have to get data from multiple tables in a single query?

Note: I am writing this as I would like to be able to link to a well written guide on the numerous questions that I constantly come across in the PHP queue, so I can link to this for further detail when I post an answer.

The answers cover off the following:

  1. Part 1 - Joins and Unions
  2. Part 2 - Subqueries
  3. Part 3 - Tricks and Efficient Code
  4. Part 4 - Subqueries in the From Clause
  5. Part 5 - Mixed Bag of John's Tricks

Best Answer 1

Part 1 - Joins and Unions

This answer covers:

  1. Part 1
    • Joining two or more tables using an inner join (See the wikipedia entry for additional info)
    • How to use a union query
    • Left and Right Outer Joins (this stackOverflow answer is excellent to describe types of joins)
    • Intersect queries (and how to reproduce them if your database doesn't support them) - this is a function of SQL-Server (see info) and part of the reason I wrote this whole thing in the first place.
  2. Part 2
    • Subqueries - what they are, where they can be used and what to watch out for
    • Cartesian joins AKA - Oh, the misery!

There are a number of ways to retrieve data from multiple tables in a database. In this answer, I will be using ANSI-92 join syntax. This may be different to a number of other tutorials out there which use the older ANSI-89 syntax, and if you are used to 89 it may seem much less intuitive at first, but all I can say is to try it, as it is much easier to understand when the queries start getting more complex. Why use it? Is there a performance gain? The short answer is no, but it is easier to read once you get used to it, and it makes queries written by other folks easier to follow.

I am also going to use the concept of a small caryard which has a database to keep track of what cars it has available. The owner has hired you as his IT Computer guy and expects you to be able to drop him the data that he asks for at the drop of a hat.

I have made a number of lookup tables that will be used by the final table. This will give us a reasonable model to work from. To start off, I will be running my queries against an example database that has the following structure. I will try to think of common mistakes that are made when starting out and explain what goes wrong with them - as well as of course showing how to correct them.

The first table is simply a color listing so that we know what colors we have in the car yard.

mysql> create table colors(id int(3) not null auto_increment primary key, 
    -> color varchar(15), paint varchar(10));
Query OK, 0 rows affected (0.01 sec)

mysql> show columns from colors;
+-------+-------------+------+-----+---------+----------------+
| Field | Type        | Null | Key | Default | Extra          |
+-------+-------------+------+-----+---------+----------------+
| id    | int(3)      | NO   | PRI | NULL    | auto_increment |
| color | varchar(15) | YES  |     | NULL    |                |
| paint | varchar(10) | YES  |     | NULL    |                |
+-------+-------------+------+-----+---------+----------------+
3 rows in set (0.01 sec)

mysql> insert into colors (color, paint) values ('Red', 'Metallic'), 
    -> ('Green', 'Gloss'), ('Blue', 'Metallic'), 
    -> ('White', 'Gloss'), ('Black', 'Gloss');
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> select * from colors;
+----+-------+----------+
| id | color | paint    |
+----+-------+----------+
|  1 | Red   | Metallic |
|  2 | Green | Gloss    |
|  3 | Blue  | Metallic |
|  4 | White | Gloss    |
|  5 | Black | Gloss    |
+----+-------+----------+
5 rows in set (0.00 sec)

The brands table identifies the different brands of cars our caryard could possibly sell.

mysql> create table brands (id int(3) not null auto_increment primary key, 
    -> brand varchar(15));
Query OK, 0 rows affected (0.01 sec)

mysql> show columns from brands;
+-------+-------------+------+-----+---------+----------------+
| Field | Type        | Null | Key | Default | Extra          |
+-------+-------------+------+-----+---------+----------------+
| id    | int(3)      | NO   | PRI | NULL    | auto_increment |
| brand | varchar(15) | YES  |     | NULL    |                |
+-------+-------------+------+-----+---------+----------------+
2 rows in set (0.01 sec)

mysql> insert into brands (brand) values ('Ford'), ('Toyota'), 
    -> ('Nissan'), ('Smart'), ('BMW');
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> select * from brands;
+----+--------+
| id | brand  |
+----+--------+
|  1 | Ford   |
|  2 | Toyota |
|  3 | Nissan |
|  4 | Smart  |
|  5 | BMW    |
+----+--------+
5 rows in set (0.00 sec)

The models table covers the different types of cars. It is simpler for this example to use car types rather than actual car models.

mysql> create table models (id int(3) not null auto_increment primary key, 
    -> model varchar(15));
Query OK, 0 rows affected (0.01 sec)

mysql> show columns from models;
+-------+-------------+------+-----+---------+----------------+
| Field | Type        | Null | Key | Default | Extra          |
+-------+-------------+------+-----+---------+----------------+
| id    | int(3)      | NO   | PRI | NULL    | auto_increment |
| model | varchar(15) | YES  |     | NULL    |                |
+-------+-------------+------+-----+---------+----------------+
2 rows in set (0.00 sec)

mysql> insert into models (model) values ('Sports'), ('Sedan'), ('4WD'), ('Luxury');
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> select * from models;
+----+--------+
| id | model  |
+----+--------+
|  1 | Sports |
|  2 | Sedan  |
|  3 | 4WD    |
|  4 | Luxury |
+----+--------+
4 rows in set (0.00 sec)

And finally, there is the table that ties all of these other tables together. The ID field is actually the unique lot number used to identify each car.

mysql> create table cars (id int(3) not null auto_increment primary key, 
    -> color int(3), brand int(3), model int(3));
Query OK, 0 rows affected (0.01 sec)

mysql> show columns from cars;
+-------+--------+------+-----+---------+----------------+
| Field | Type   | Null | Key | Default | Extra          |
+-------+--------+------+-----+---------+----------------+
| id    | int(3) | NO   | PRI | NULL    | auto_increment |
| color | int(3) | YES  |     | NULL    |                |
| brand | int(3) | YES  |     | NULL    |                |
| model | int(3) | YES  |     | NULL    |                |
+-------+--------+------+-----+---------+----------------+
4 rows in set (0.00 sec)

mysql> insert into cars (color, brand, model) values (1,2,1), (3,1,2), (5,3,1), 
    -> (4,4,2), (2,2,3), (3,5,4), (4,1,3), (2,2,1), (5,2,3), (4,5,1);
Query OK, 10 rows affected (0.00 sec)
Records: 10  Duplicates: 0  Warnings: 0

mysql> select * from cars;
+----+-------+-------+-------+
| id | color | brand | model |
+----+-------+-------+-------+
|  1 |     1 |     2 |     1 |
|  2 |     3 |     1 |     2 |
|  3 |     5 |     3 |     1 |
|  4 |     4 |     4 |     2 |
|  5 |     2 |     2 |     3 |
|  6 |     3 |     5 |     4 |
|  7 |     4 |     1 |     3 |
|  8 |     2 |     2 |     1 |
|  9 |     5 |     2 |     3 |
| 10 |     4 |     5 |     1 |
+----+-------+-------+-------+
10 rows in set (0.00 sec)

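This will give us enough data (I hope) to cover the examples below of the different types of joins, and enough data to make them worthwhile.

So, getting into the grit of it, the boss wants to know the IDs of all the sports cars he has.

This is a simple two table join. We have a table that identifies the model and a table with the available stock in it. As you can see, the data in the model column of the cars table relates to the ID column of the models table. We know that the models table has an ID of 1 for Sports, so let's write the join.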

select
    ID,
    model
from
    cars
        join models
            on model=ID

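So this query looks good, right? We have identified the two tables that contain the information we need, and we use a join that correctly identifies which columns to join on.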

ERROR 1052 (23000): Column 'ID' in field list is ambiguous

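Oh noes! An error in our first query! Yes, and it is a plum. You see, the query has indeed got the right columns, but some of them exist in both tables, so the database gets confused about which column we actually mean and where it lives. There are two ways to solve this. The first is nice and simple: we can use tableName.columnName to tell the database exactly what we mean, like this: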

select
    cars.ID,
    models.model
from
    cars
        join models
            on cars.model=models.ID

+----+--------+
| ID | model  |
+----+--------+
|  1 | Sports |
|  3 | Sports |
|  8 | Sports |
| 10 | Sports |
|  2 | Sedan  |
|  4 | Sedan  |
|  5 | 4WD    |
|  7 | 4WD    |
|  9 | 4WD    |
|  6 | Luxury |
+----+--------+
10 rows in set (0.00 sec)

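The other is probably used more often and is called table aliasing. The tables in this example have nice, short, simple names, but typing out something like KPI_DAILY_SALES_BY_DEPARTMENT would probably get old quickly, so a simple way is to nickname the table like this: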

select
    a.ID,
    b.model
from
    cars a
        join models b
            on a.model=b.ID

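Now, back to the request. As you can see, we have the information we need, but we also have information that wasn't asked for, so we need to include a where clause in the statement to get only the Sports cars that were asked for. As I prefer the table alias method to typing the table names over and over, I will stick to it from this point onwards.

Clearly, we need to add a where clause to our query. We can identify Sports cars either by ID=1 or by model='Sports'. As the ID is indexed and is the primary key (and it happens to be less typing), let's use that in our query.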

select
    a.ID,
    b.model
from
    cars a
        join models b
            on a.model=b.ID
where
    b.ID=1

+----+--------+
| ID | model  |
+----+--------+
|  1 | Sports |
|  3 | Sports |
|  8 | Sports |
| 10 | Sports |
+----+--------+
4 rows in set (0.00 sec)
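
For anyone who wants to try this without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The choice of SQLite and the variable names are my assumptions, not part of the original setup, and the error text will differ from the MySQL message shown above. It reproduces the ambiguous-column failure and then the aliased query with the where clause.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL caryard.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table models (id integer primary key, model text);
    create table cars (id integer primary key, color int, brand int, model int);
    insert into models (model) values ('Sports'), ('Sedan'), ('4WD'), ('Luxury');
    insert into cars (color, brand, model) values
        (1,2,1), (3,1,2), (5,3,1), (4,4,2), (2,2,3),
        (3,5,4), (4,1,3), (2,2,1), (5,2,3), (4,5,1);
""")

# Unqualified names that exist in both tables are rejected by the database.
try:
    conn.execute("select id, model from cars join models on model = id")
    ambiguous_error = None
except sqlite3.OperationalError as exc:
    ambiguous_error = str(exc)

# Table aliases remove the ambiguity; the where clause keeps only Sports cars.
sports_cars = conn.execute("""
    select a.id, b.model
    from cars a
        join models b
            on a.model = b.id
    where b.id = 1
    order by a.id
""").fetchall()

print(ambiguous_error)
print(sports_cars)  # [(1, 'Sports'), (3, 'Sports'), (8, 'Sports'), (10, 'Sports')]
```

The same pattern (build the query once, run it once, read all rows) is what a PHP application would do through its own database driver.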

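Bingo! The boss is happy. Of course, being a boss and never being happy with what he asked for, he looks at the information, then says "I want the colors as well."

Okay, so we have a good part of our query already written, but we need to use a third table, which is colors. Our main information table, cars, stores the car color ID, and this links back to the ID column of colors. So, in a similar manner to the original, we can join a third table: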

select
    a.ID,
    b.model
from
    cars a
        join models b
            on a.model=b.ID
        join colors c
            on a.color=c.ID
where
    b.ID=1

+----+--------+
| ID | model  |
+----+--------+
|  1 | Sports |
|  3 | Sports |
|  8 | Sports |
| 10 | Sports |
+----+--------+
4 rows in set (0.00 sec)

Damn, although the table was correctly joined and the related columns were linked, we forgot to pull in the actual information from the new table that we just linked.

select
    a.ID,
    b.model,
    c.color
from
    cars a
        join models b
            on a.model=b.ID
        join colors c
            on a.color=c.ID
where
    b.ID=1

+----+--------+-------+
| ID | model  | color |
+----+--------+-------+
|  1 | Sports | Red   |
|  8 | Sports | Green |
| 10 | Sports | White |
|  3 | Sports | Black |
+----+--------+-------+
4 rows in set (0.00 sec)

Right, that's the boss off our back for a moment. Now, to explain some of this in a little more detail. As you can see, the from clause in our statement links our main table to the others (I often use a table that contains information, rather than a lookup or dimension table, as the main one). The query would work just as well with the tables all switched around, but it would make less sense when we come back to read it in a few months' time, so it is often best to try to write a query that will be nice and easy to understand - lay it out intuitively, use nice indenting so that everything is as clear as it can be. If you go on to teach others, try to instill these characteristics in their queries - especially if you will be troubleshooting them.

It is entirely possible to keep linking more and more tables in this manner.

select
    a.ID,
    b.model,
    c.color
from
    cars a
        join models b
            on a.model=b.ID
        join colors c
            on a.color=c.ID
        join brands d
            on a.brand=d.ID
where
    b.ID=1

I forgot to include a table where we might want to join on more than one column, so here is a hypothetical example. If the models table had brand-specific models, and therefore also had a column called brand which linked back to the brands table on the ID field, it could be done like this:

select
    a.ID,
    b.model,
    c.color
from
    cars a
        join models b
            on a.model=b.ID
        join colors c
            on a.color=c.ID
        join brands d
            on a.brand=d.ID
            and b.brand=d.ID
where
    b.ID=1

As you can see, the query above not only links the joined tables to the main cars table, but also specifies joins between the already joined tables. If the join conditions are left out, the result is called a cartesian join - which is dba speak for bad. A cartesian join is one where the query gives the database no way to limit how rows are matched, so it returns every combination of rows from the tables involved.

So, to give an example of a cartesian join, let's run the following query:

select
    a.ID,
    b.model
from
    cars a
        join models b

+----+--------+
| ID | model  |
+----+--------+
|  1 | Sports |
|  1 | Sedan  |
|  1 | 4WD    |
|  1 | Luxury |
|  2 | Sports |
|  2 | Sedan  |
|  2 | 4WD    |
|  2 | Luxury |
|  3 | Sports |
|  3 | Sedan  |
|  3 | 4WD    |
|  3 | Luxury |
|  4 | Sports |
|  4 | Sedan  |
|  4 | 4WD    |
|  4 | Luxury |
|  5 | Sports |
|  5 | Sedan  |
|  5 | 4WD    |
|  5 | Luxury |
|  6 | Sports |
|  6 | Sedan  |
|  6 | 4WD    |
|  6 | Luxury |
|  7 | Sports |
|  7 | Sedan  |
|  7 | 4WD    |
|  7 | Luxury |
|  8 | Sports |
|  8 | Sedan  |
|  8 | 4WD    |
|  8 | Luxury |
|  9 | Sports |
|  9 | Sedan  |
|  9 | 4WD    |
|  9 | Luxury |
| 10 | Sports |
| 10 | Sedan  |
| 10 | 4WD    |
| 10 | Luxury |
+----+--------+
40 rows in set (0.00 sec)

Good god, that's ugly. However, as far as the database is concerned, it is exactly what was asked for. In the query, we asked for the ID from cars and the model from models. However, because we didn't specify how to join the tables, the database has matched every row from the first table with every row from the second table: 10 cars times 4 models gives 40 rows.
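
The row arithmetic is easy to check. The sketch below again assumes Python's sqlite3 module and an in-memory copy of the cars and models tables (my choice, not the original MySQL setup), and spells the cartesian join as an explicit cross join so the intent is visible in the query text.

```python
import sqlite3

# Assumption: in-memory SQLite copy of the cars and models tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table models (id integer primary key, model text);
    create table cars (id integer primary key, color int, brand int, model int);
    insert into models (model) values ('Sports'), ('Sedan'), ('4WD'), ('Luxury');
    insert into cars (color, brand, model) values
        (1,2,1), (3,1,2), (5,3,1), (4,4,2), (2,2,3),
        (3,5,4), (4,1,3), (2,2,1), (5,2,3), (4,5,1);
""")

# No join condition: every car is paired with every model.
cartesian_rows = conn.execute("""
    select a.id, b.model
    from cars a
        cross join models b
""").fetchall()

# With a join condition: each car is paired only with its own model.
joined_rows = conn.execute("""
    select a.id, b.model
    from cars a
        join models b
            on a.model = b.id
""").fetchall()

print(len(cartesian_rows))  # 40
print(len(joined_rows))     # 10
```

On tables of this size the mistake is only ugly; on tables with millions of rows the same mistake multiplies into a result set large enough to hurt.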

Okay, so the boss is back, and he wants more information again: "I want the same list, but also include 4WDs in it."

This, however, gives us a great excuse to look at two different ways to accomplish this. We could add another condition to the where clause like this:

select
    a.ID,
    b.model,
    c.color
from
    cars a
        join models b
            on a.model=b.ID
        join colors c
            on a.color=c.ID
        join brands d
            on a.brand=d.ID
where
    b.ID=1
    or b.ID=3

While the above will work perfectly well, let's look at it differently; this is a great excuse to show how a union query works.

We know that the following will return all the Sports cars:

select
    a.ID,
    b.model,
    c.color
from
    cars a
        join models b
            on a.model=b.ID
        join colors c
            on a.color=c.ID
        join brands d
            on a.brand=d.ID
where
    b.ID=1

And the following would return all the 4WDs:

select
    a.ID,
    b.model,
    c.color
from
    cars a
        join models b
            on a.model=b.ID
        join colors c
            on a.color=c.ID
        join brands d
            on a.brand=d.ID
where
    b.ID=3

So by adding a union all clause between them, the results of the second query will be appended to the results of the first query.

select
    a.ID,
    b.model,
    c.color
from
    cars a
        join models b
            on a.model=b.ID
        join colors c
            on a.color=c.ID
        join brands d
            on a.brand=d.ID
where
    b.ID=1
union all
select
    a.ID,
    b.model,
    c.color
from
    cars a
        join models b
            on a.model=b.ID
        join colors c
            on a.color=c.ID
        join brands d
            on a.brand=d.ID
where
    b.ID=3

+----+--------+-------+
| ID | model  | color |
+----+--------+-------+
|  1 | Sports | Red   |
|  8 | Sports | Green |
| 10 | Sports | White |
|  3 | Sports | Black |
|  5 | 4WD    | Green |
|  7 | 4WD    | White |
|  9 | 4WD    | Black |
+----+--------+-------+
7 rows in set (0.00 sec)

As you can see, the results of the first query are returned first, followed by the results of the second query.

In this example, it would of course have been much easier to simply use the first query, but union queries can be great for specific cases. They are a great way to return specific results from tables that aren't easily joined together - or, for that matter, from completely unrelated tables. There are a few rules to follow, however.

  • The column types from the first query must match the column types from every other query below.
  • The names of the columns from the first query will be used to identify the entire set of results.
  • The number of columns in each query must be the same.

Now, you might be wondering what the difference is between using union and union all. A union query will remove duplicates, while a union all will not. This does mean that there is a small performance hit when using union over union all, but the results may be worth it - I won't speculate on that sort of thing in this answer though.

On this note, a few additional points are worth mentioning.

  • If we wanted to order the results, we can use an order by but you can't use the alias anymore. In the query above, appending an order by a.ID would result in an error - as far as the results are concerned, the column is called ID rather than a.ID - even though the same alias has been used in both queries.
  • We can only have one order by statement, and it must be as the last statement.
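
To see the duplicate handling in isolation, here is a small sketch that runs two overlapping queries against the colors table, once with union all and once with union. It assumes Python's sqlite3 module with an in-memory database rather than MySQL, and the query template is purely illustrative. The single order by sits at the very end and refers to the result column by its plain name; how a qualified name such as a.id is treated there varies between databases.

```python
import sqlite3

# Assumption: in-memory SQLite copy of the colors table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table colors (id integer primary key, color text, paint text);
    insert into colors (color, paint) values
        ('Red', 'Metallic'), ('Green', 'Gloss'), ('Blue', 'Metallic'),
        ('White', 'Gloss'), ('Black', 'Gloss');
""")

# Both halves return the Blue row (id 3), so the two operators differ.
query = """
    select a.id, a.color from colors a where a.id > 2
    {op}
    select a.id, a.color from colors a where a.id < 4
    order by id
"""

union_all_rows = conn.execute(query.format(op="union all")).fetchall()
union_rows = conn.execute(query.format(op="union")).fetchall()

print(len(union_all_rows))  # 6, the Blue row appears twice
print(len(union_rows))      # 5, the duplicate has been removed
```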

For the next examples, I am adding a few extra rows to our tables.

I have added Holden to the brands table. I have also added a row into cars that has the color value of 12 - which has no reference in the colors table.

Okay, the boss is back again, barking requests out - "I want a count of each brand we carry and the number of cars in it!" - Typical, we just get to an interesting section of our discussion and the boss wants more work.

Rightyo, so the first thing we need to do is get a complete listing of possible brands.

select
    a.brand
from
    brands a

+--------+
| brand  |
+--------+
| Ford   |
| Toyota |
| Nissan |
| Smart  |
| BMW    |
| Holden |
+--------+
6 rows in set (0.00 sec)

Now, when we join this to our cars table we get the following result:

select
    a.brand
from
    brands a
        join cars b
            on a.ID=b.brand
group by
    a.brand

+--------+
| brand  |
+--------+
| BMW    |
| Ford   |
| Nissan |
| Smart  |
| Toyota |
+--------+
5 rows in set (0.00 sec)

Which is of course a problem - we aren't seeing any mention of the lovely Holden brand I added.

This is because a join looks for matching rows in both tables. As there is no data in cars that is of type Holden it isn't returned. This is where we can use an outer join. This will return all the results from one table whether they are matched in the other table or not:

select
    a.brand
from
    brands a
        left outer join cars b
            on a.ID=b.brand
group by
    a.brand

+--------+
| brand  |
+--------+
| BMW    |
| Ford   |
| Holden |
| Nissan |
| Smart  |
| Toyota |
+--------+
6 rows in set (0.00 sec)

Now that we have that, we can add a lovely aggregate function to get a count and get the boss off our backs for a moment.

select
    a.brand,
    count(b.id) as countOfBrand
from
    brands a
        left outer join cars b
            on a.ID=b.brand
group by
    a.brand

+--------+--------------+
| brand  | countOfBrand |
+--------+--------------+
| BMW    |            2 |
| Ford   |            2 |
| Holden |            0 |
| Nissan |            1 |
| Smart  |            1 |
| Toyota |            5 |
+--------+--------------+
6 rows in set (0.00 sec)

And with that, away the boss skulks.
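
Here is a runnable sketch of the same comparison, assuming Python's sqlite3 module and an in-memory database in place of MySQL. The extra car row is an assumption on my part: the text only says its color is 12, and I have given it the Toyota brand because that matches the counts shown above (the model value is arbitrary). The sketch also shows why count(b.id) is used rather than count(*).

```python
import sqlite3

# Assumption: in-memory SQLite copy of brands and cars, including Holden
# and one extra car (color 12, brand Toyota, arbitrary model).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table brands (id integer primary key, brand text);
    create table cars (id integer primary key, color int, brand int, model int);
    insert into brands (brand) values
        ('Ford'), ('Toyota'), ('Nissan'), ('Smart'), ('BMW'), ('Holden');
    insert into cars (color, brand, model) values
        (1,2,1), (3,1,2), (5,3,1), (4,4,2), (2,2,3),
        (3,5,4), (4,1,3), (2,2,1), (5,2,3), (4,5,1),
        (12,2,1);
""")

template = """
    select a.brand, count(b.id), count(*)
    from brands a
        {join_type} cars b
            on a.id = b.brand
    group by a.brand
    order by a.brand
"""

inner_counts = conn.execute(template.format(join_type="join")).fetchall()
outer_counts = conn.execute(template.format(join_type="left outer join")).fetchall()

print([row[0] for row in inner_counts])  # Holden is missing
print(outer_counts)  # Holden appears as ('Holden', 0, 1)
```

For Holden the outer join produces one row whose cars columns are all NULL. count(b.id) ignores that NULL and reports 0, while count(*) counts the row itself and reports 1, which would be wrong for the boss's report.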

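Now, to explain this in some more detail, outer joins can be of the left or right type. The left or right defines which table is fully included. A left outer join will include all the rows from the table on the left, while (you guessed it) a right outer join brings all the rows from the table on the right into the results.

Some databases allow a full outer join, which brings back results (whether matched or not) from both tables, but this isn't supported in all databases.

Now, I figure that at this point you are wondering whether or not you can mix join types in a query - and the answer is yes, you absolutely can.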

select
    b.brand,
    c.color,
    count(a.id) as countOfBrand
from
    cars a
        right outer join brands b
            on b.ID=a.brand
        join colors c
            on a.color=c.ID
group by
    a.brand,
    c.color

+--------+-------+--------------+
| brand  | color | countOfBrand |
+--------+-------+--------------+
| Ford   | Blue  |            1 |
| Ford   | White |            1 |
| Toyota | Black |            1 |
| Toyota | Green |            2 |
| Toyota | Red   |            1 |
| Nissan | Black |            1 |
| Smart  | White |            1 |
| BMW    | Blue  |            1 |
| BMW    | White |            1 |
+--------+-------+--------------+
9 rows in set (0.00 sec)

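So, why are those not the results that were expected? It is because, although we selected an outer join from cars to brands, it wasn't specified in the join to colors - so that particular join will only bring back results that match in both tables.

Here is the query that would work to get the results that we expected: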

select
    a.brand,
    c.color,
    count(b.id) as countOfBrand
from
    brands a
        left outer join cars b
            on a.ID=b.brand
        left outer join colors c
            on b.color=c.ID
group by
    a.brand,
    c.color

+--------+-------+--------------+
| brand  | color | countOfBrand |
+--------+-------+--------------+
| BMW    | Blue  |            1 |
| BMW    | White |            1 |
| Ford   | Blue  |            1 |
| Ford   | White |            1 |
| Holden | NULL  |            0 |
| Nissan | Black |            1 |
| Smart  | White |            1 |
| Toyota | NULL  |            1 |
| Toyota | Black |            1 |
| Toyota | Green |            2 |
| Toyota | Red   |            1 |
+--------+-------+--------------+
11 rows in set (0.00 sec)

As we can see, we have two outer joins in the query and the results are coming through as expected.
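
The effect of that second join type can be checked with a short sketch. As before, this assumes Python's sqlite3 module with an in-memory database instead of MySQL, and the extra car (color 12, brand Toyota, arbitrary model) is my assumption chosen to match the output above. Both queries start from brands with a left outer join to cars; only the join to colors changes.

```python
import sqlite3

# Assumption: in-memory SQLite copy of brands, colors and cars.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table colors (id integer primary key, color text, paint text);
    create table brands (id integer primary key, brand text);
    create table cars (id integer primary key, color int, brand int, model int);
    insert into colors (color, paint) values
        ('Red', 'Metallic'), ('Green', 'Gloss'), ('Blue', 'Metallic'),
        ('White', 'Gloss'), ('Black', 'Gloss');
    insert into brands (brand) values
        ('Ford'), ('Toyota'), ('Nissan'), ('Smart'), ('BMW'), ('Holden');
    insert into cars (color, brand, model) values
        (1,2,1), (3,1,2), (5,3,1), (4,4,2), (2,2,3),
        (3,5,4), (4,1,3), (2,2,1), (5,2,3), (4,5,1),
        (12,2,1);
""")

template = """
    select a.brand, c.color, count(b.id)
    from brands a
        left outer join cars b
            on a.id = b.brand
        {color_join} colors c
            on b.color = c.id
    group by a.brand, c.color
"""

# An inner join to colors discards every row without a matching color.
mixed_rows = conn.execute(template.format(color_join="join")).fetchall()
# Keeping the join to colors outer preserves those rows with a NULL color.
outer_rows = conn.execute(template.format(color_join="left outer join")).fetchall()

print(len(mixed_rows))  # 9
print(len(outer_rows))  # 11
```

The two rows that only the all-outer version returns are Holden (no cars at all) and the Toyota whose color ID has no entry in colors.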

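Now, how about those other types of joins, you ask? What about intersections?

Well, not all databases support intersect, but pretty much all databases will allow you to create an intersection through a join (or a well structured where statement at the least).

An intersection is a type of query somewhat similar to a union as described above - but the difference is that it only returns rows of data that are identical (and I do mean identical) between the individual queries combined by the intersect. Only rows that are identical in every regard will be returned.

A simple example would be as such: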

select
    *
from
    colors
where
    ID>2
intersect
select
    *
from
    colors
where
    id<4

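While a normal union query would return all the rows of the table (the first query returning anything with ID>2 and the second anything with ID<4), which would result in the full set, an intersect query would only return the row with id=3, as it meets both criteria.

Now, if your database doesn't support an intersect query, the above can be easily accomplished with the following query: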

select
    a.ID,
    a.color,
    a.paint
from
    colors a
        join colors b
            on a.ID=b.ID
where
    a.ID>2
    and b.ID<4

+----+-------+----------+
| ID | color | paint    |
+----+-------+----------+
|  3 | Blue  | Metallic |
+----+-------+----------+
1 row in set (0.00 sec)

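If you wish to perform an intersection across two different tables using a database that doesn't inherently support an intersect query, you will need to create a join on every column of the tables.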
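
As a quick check that the two approaches agree, the sketch below runs both against the colors table. It assumes Python's sqlite3 module and an in-memory database (SQLite happens to support intersect, which makes the comparison possible), and the join version compares every column, as described above, rather than only the ID.

```python
import sqlite3

# Assumption: in-memory SQLite copy of the colors table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table colors (id integer primary key, color text, paint text);
    insert into colors (color, paint) values
        ('Red', 'Metallic'), ('Green', 'Gloss'), ('Blue', 'Metallic'),
        ('White', 'Gloss'), ('Black', 'Gloss');
""")

# Native intersect: only rows returned by both queries survive.
intersect_rows = conn.execute("""
    select * from colors where id > 2
    intersect
    select * from colors where id < 4
""").fetchall()

# Join-based equivalent: match the two sides on every column.
join_rows = conn.execute("""
    select a.id, a.color, a.paint
    from colors a
        join colors b
            on a.id = b.id
            and a.color = b.color
            and a.paint = b.paint
    where a.id > 2
        and b.id < 4
""").fetchall()

print(intersect_rows)  # [(3, 'Blue', 'Metallic')]
print(join_rows)       # [(3, 'Blue', 'Metallic')]
```

One caveat with the join version: an intersect treats two NULLs in the same column as a match, while an equality join does not, so columns that can contain NULL need extra handling.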
