返回 Discover
Field DispatchStack Overflow9 · 2026-05-27

Query with nested joins taking a very long time

Tags
sqlpostgresql
Score
1
Answers
1
Views
127
Answered
Yes
Accepted
Yes
痛点分析发布于 2026/05/26

痛点为 AI 基于上游原始证据的初步提炼;未包含额外中国市场检索。

痛点

用户需要从4张关联表中统计每个域名的渠道数、销售数和访问数,但使用嵌套LEFT JOIN配合COUNT(DISTINCT)的查询“永远不返回结果”。现有流程中,用户试图通过一次查询完成所有聚合,但由于多表连接产生大量中间行,加上COUNT(DISTINCT)对资源消耗极大,导致查询超时。这造成用户无法及时获取业务统计报表,被迫等待或放弃查询,严重影响数据分析效率。

§ Dossier

Stack Overflow question

I have 4 tables: domains: :has many => channels :has_many => sales :has_many => visits I'm trying to get all my domains, channels count, sales count and visits count for each domain in this query: SELECT kd.id, kd.domain, COUNT(distinct kc.id) AS channels_count, COUNT(distinct kv.id) AS visits_count, COUNT(distinct kv.ip_address) AS visitors_count, COUNT(distinct ks.id) AS sales_count FROM domains AS kd LEFT JOIN channels AS kc ON(kc.domain_id=kd.id) LEFT JOIN sales AS ks ON(ks.channel_id=kc.id) LEFT JOIN visits AS kv ON(kv.channel_id=kc.id) GROUP BY kd.id I have a few thousand records in sales and visits table, nothing big but my query takes forever. It never comes back. Any idea why?

§ Dossier

Question details

View count
127
Answer count
1
Last activity
2026/05/26
§ Dossier

Answers

COUNT(DISTINCT) can be a resource hog. In addition, your tables probably have many duplicates on the kd.id column, resulting in zillions of rows being generated. A more efficient approach is to do the aggregation first, and then join the results together. I think the following query does what you want (I'm assuming that the id columns are unique in each table): SELECT kd.id, kd.domain, channels_count, visits_count, visitors_count, sales_count FROM domains kd LEFT JOIN (select kc.domain_id, count(*) as channels_count from channels kc group by kc.domain_id ) kc ON(kc.domain_id=kd.id) LEFT JOIN (select ks.channel_id, count(*) as sales_count from sales ks group by ks.channel_id ) ks ON(ks.channel_id=kc.id) LEFT JOIN (select kv.channel_id, count(*) as visits_count, count(distinct ipaddress) as visitors_count from visits kv on kv.channel_id ) ON(kv.channel_id=kc.id) GROUP BY kd.id

评论作者信息不可用Accepted2 votes
源数据· Raw Archive
source
Stack Overflow
upstream_source
stackoverflow
upstream_item_id
11680129
daily_ranking_item_id
e53a7123-4122-4d46-ac26-22112b81fca3
rank_date
2026-05-27
rank
9
name
Query with nested joins taking a very long time
tagline
sql, postgresql
description
I have 4 tables: domains: :has many => channels :has_many => sales :has_many => visits I'm trying to get all my domains, channels count, sales count and visits count for each domain in this query: SELECT kd.id, kd.domain, COUNT(distinct kc.id) AS channels_count, COUNT(distinct kv.id) AS visits_count, COUNT(distinct kv.ip_address) AS visitors_count, COUNT(distinct ks.id) AS sales_count FROM domains AS kd LEFT JOIN channels AS kc ON(kc.domain_id=kd.id) LEFT JOIN sales AS ks ON(ks.channel_id=kc.id) LEFT JOIN visits AS kv ON(kv.channel_id=kc.id) GROUP BY kd.id I have a few thousand records in sales and visits table, nothing big but my query takes forever. It never comes back. Any idea why?
votes_count
1
comments_count
1
created_at_on_source
2012-07-27T00:57:22.000Z
topics
sqlpostgresql
media / source-specific data
{
  "stackoverflow": {
    "score": 1,
    "view_count": 127,
    "is_answered": true,
    "top_answers": [
      {
        "body": "COUNT(DISTINCT) can be a resource hog. In addition, your tables probably have many duplicates on the kd.id column, resulting in zillions of rows being generated. A more efficient approach is to do the aggregation first, and then join the results together. I think the following query does what you want (I'm assuming that the id columns are unique in each table): SELECT kd.id, kd.domain, channels_count, visits_count, visitors_count, sales_count FROM domains kd LEFT JOIN (select kc.domain_id, count(*) as channels_count from channels kc group by kc.domain_id ) kc ON(kc.domain_id=kd.id) LEFT JOIN (select ks.channel_id, count(*) as sales_count from sales ks group by ks.channel_id ) ks ON(ks.channel_id=kc.id) LEFT JOIN (select kv.channel_id, count(*) as visits_count, count(distinct ipaddress) as visitors_count from visits kv on kv.channel_id ) ON(kv.channel_id=kc.id) GROUP BY kd.id",
        "score": 2,
        "answer_id": 11680182,
        "is_accepted": true
      }
    ],
    "answer_count": 1,
    "accepted_answer_id": 11680182,
    "last_activity_date": 1779831004
  }
}
raw_payload
{
  "stats": {
    "score": 1,
    "view_count": 127,
    "is_answered": true,
    "answer_count": 1,
    "creation_date": 1343350642,
    "last_edit_date": 1779831004,
    "accepted_answer_id": 11680182,
    "last_activity_date": 1779831004
  },
  "api_wrapper": {
    "backoff": null,
    "has_more": true,
    "page_size": 8,
    "quota_max": 300,
    "quota_remaining": 291
  },
  "question_id": 11680129,
  "answer_fetch": {
    "has_more": false,
    "answers_fetched": 1,
    "answer_page_size": 3
  },
  "snapshot_version": "stackoverflow_question_v1"
}
source_raw_snapshot
{
  "id": "34a457a4-7f2b-46eb-ad33-7dd3e3d02976",
  "daily_ranking_item_id": "e53a7123-4122-4d46-ac26-22112b81fca3",
  "source": "stackoverflow",
  "external_id": "11680129",
  "fetched_at": "2026-05-26T22:02:05.726Z",
  "question_raw": {
    "body": "<p>I have 4 tables:</p>\n<pre><code>domains:\n    :has many =&gt; channels\n        :has_many =&gt; sales\n        :has_many =&gt; visits\n</code></pre>\n<p>I'm trying to get all my domains, channels count, sales count and visits count for each domain in this query:</p>\n<pre><code>SELECT\n    kd.id, kd.domain,\n    COUNT(distinct kc.id) AS channels_count,\n    COUNT(distinct kv.id) AS visits_count,\n    COUNT(distinct kv.ip_address) AS visitors_count,\n    COUNT(distinct ks.id) AS sales_count\nFROM\n    domains AS kd\nLEFT JOIN\n    channels AS kc\n    ON(kc.domain_id=kd.id)\nLEFT JOIN\n    sales AS ks\n    ON(ks.channel_id=kc.id)\nLEFT JOIN\n    visits AS kv\n    ON(kv.channel_id=kc.id)\nGROUP BY kd.id\n</code></pre>\n<p>I have a few thousand records in sales and visits table, nothing big but my query takes forever. It never comes back. Any idea why?</p>\n",
    "link": "https://stackoverflow.com/questions/11680129/query-with-nested-joins-taking-a-very-long-time",
    "tags": [
      "sql",
      "postgresql"
    ],
    "owner": {
      "link": "https://stackoverflow.com/users/762174/romeo-m",
      "user_id": 762174,
      "user_type": "registered",
      "account_id": 397826,
      "reputation": 3308,
      "accept_rate": 56,
      "display_name": "Romeo M.",
      "profile_image": "https://www.gravatar.com/avatar/2e7d37c459d20cc14364f6ce352f6b8a?s=256&d=identicon&r=PG"
    },
    "score": 1,
    "title": "Query with nested joins taking a very long time",
    "view_count": 127,
    "is_answered": true,
    "question_id": 11680129,
    "answer_count": 1,
    "creation_date": 1343350642,
    "last_edit_date": 1779831004,
    "content_license": "CC BY-SA 4.0",
    "accepted_answer_id": 11680182,
    "last_activity_date": 1779831004
  },
  "answers_raw": [
    {
      "body": "<p><code>COUNT(DISTINCT)</code> can be a resource hog.  In addition, your tables probably have many duplicates on the <code>kd.id</code> column, resulting in zillions of rows being generated.</p>\n<p>A more efficient approach is to do the aggregation first, and then join the results together.  I think the following query does what you want (I'm assuming that the <code>id</code> columns are unique in each table):</p>\n<pre><code>SELECT kd.id, kd.domain, channels_count, visits_count,\n       visitors_count, sales_count\nFROM domains kd LEFT JOIN\n     (select kc.domain_id, count(*) as channels_count\n      from channels kc\n      group by kc.domain_id\n     ) kc\n     ON(kc.domain_id=kd.id) LEFT JOIN\n     (select ks.channel_id, count(*) as sales_count\n      from sales ks\n      group by ks.channel_id\n     ) ks\n     ON(ks.channel_id=kc.id) LEFT JOIN\n     (select kv.channel_id, count(*) as visits_count,\n             count(distinct ipaddress) as visitors_count\n      from visits kv\n      on kv.channel_id\n     )\n     ON(kv.channel_id=kc.id)\nGROUP BY kd.id\n</code></pre>\n",
      "owner": {
        "link": "https://stackoverflow.com/users/1144035/gordon-linoff",
        "user_id": 1144035,
        "user_type": "registered",
        "account_id": 1165580,
        "reputation": 1276902,
        "display_name": "Gordon Linoff",
        "profile_image": "https://www.gravatar.com/avatar/e514b017977ebf742a418cac697d8996?s=256&d=identicon&r=PG"
      },
      "score": 2,
      "answer_id": 11680182,
      "is_accepted": true,
      "question_id": 11680129,
      "creation_date": 1343351269,
      "last_edit_date": 1779810712,
      "content_license": "CC BY-SA 4.0",
      "last_activity_date": 1779810712
    }
  ],
  "tags_raw": [
    "sql",
    "postgresql"
  ],
  "stats_raw": {
    "score": 1,
    "view_count": 127,
    "is_answered": true,
    "answer_count": 1,
    "creation_date": 1343350642,
    "last_edit_date": 1779831004,
    "accepted_answer_id": 11680182,
    "last_activity_date": 1779831004
  },
  "selection_meta": {
    "site": "stackoverflow",
    "api_wrapper": {
      "backoff": null,
      "has_more": true,
      "page_size": 8,
      "quota_max": 300,
      "quota_remaining": 291
    },
    "answer_fetch": {
      "backoff": null,
      "has_more": false,
      "answers_fetched": 1,
      "quota_remaining": 262,
      "answer_page_size": 3
    },
    "snapshot_version": "stackoverflow_question_v1",
    "selection_strategy": "tag_whitelist_unanswered_high_score_recent_active"
  },
  "created_at": "2026-05-26T22:02:06.102Z",
  "updated_at": "2026-05-26T22:02:06.102Z"
}