Movie scene detection is challenging because it requires accurately measuring the relationships among shots to find scene boundaries. Most existing methods measure these relationships by similarity, which can fail to distinguish similar shots from different scenes and can overlook the relationship between dissimilar shots within the same scene. In this paper, we propose a movie scene detection method based on Clue Relationship and Constrained Shot Description (CRCSD) to address these challenges. First, we propose the self-clue and the entangled-clue, and rebalance the relationships among shots through clue relevance (CR) to distinguish similar shots from different scenes and bridge dissimilar shots in the same scene. We use the information of an entire clue rather than discrete shots for movie scene detection, which is more in line with how humans think. Second, we propose Shotboard, which adds constraints from metadata and shot properties to generate descriptions of shots from the camera's perspective. A BEncoder extracts board features from these shot descriptions, establishing latent associations across shots that further alleviate these challenges. Finally, we build two modal clue graphs, transfer weights between the graphs, and propagate messages within each graph to learn shot features with clue context, which are used to identify the ending shot of a scene. Experiments on multiple public datasets show that our method significantly improves movie scene detection performance. For example, we improve the Average Precision (AP) by 5.5% on the MovieNet dataset. The code is available at https://github.com/KJWQYY/CRCSD.
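To make the limitation of similarity-based detection concrete, the following is a minimal sketch of a generic similarity baseline, not the CRCSD method and not any specific prior work. The two-dimensional shot features, the threshold of 0.5, and the function names `cosine` and `similarity_boundaries` are all hypothetical and chosen only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_boundaries(shot_features, threshold=0.5):
    """Flag shot i as a scene-ending shot when it is dissimilar to shot i + 1.

    This is an illustrative baseline (assumed, not taken from the paper).
    """
    return [
        i
        for i in range(len(shot_features) - 1)
        if cosine(shot_features[i], shot_features[i + 1]) < threshold
    ]


# Hypothetical toy features: shots 0-2 form scene A, shots 3-4 form scene B,
# so the only true scene-ending shot is shot 2.
shots = [
    [1.0, 0.0],  # scene A, wide shot
    [0.0, 1.0],  # scene A, close-up that looks very different
    [1.0, 0.1],  # scene A, back to the wide shot
    [1.0, 0.0],  # scene B, visually similar to the previous shot
    [0.9, 0.1],  # scene B
]
true_boundaries = [2]
predicted = similarity_boundaries(shots)
print("predicted:", predicted, "true:", true_boundaries)
```

On this toy input the baseline flags shots 0 and 1, splitting scene A at its dissimilar close-up, and misses the true boundary after shot 2 because the first shot of scene B looks similar to it. These are the two failure cases that clue relevance is intended to address.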