WO2009133403A2 - Television system - Google Patents

Television system

Info

Publication number
WO2009133403A2
Authority
WO
WIPO (PCT)
Prior art keywords
frame rate
signal
video
frames
colour
Prior art date
Application number
PCT/GB2009/050450
Other languages
French (fr)
Other versions
WO2009133403A3 (en)
Inventor
John Thomas Zubrzycki
Thomas James Davies
David John Flynn
Matthew Edward Hammond
Stephen Jeremy Edward Jolly
Richard Aubrey Salmon
Michael Glyn Armstrong
Original Assignee
British Broadcasting Corporation
Priority date
Filing date
Publication date
Application filed by British Broadcasting Corporation filed Critical British Broadcasting Corporation
Publication of WO2009133403A2 publication Critical patent/WO2009133403A2/en
Publication of WO2009133403A3 publication Critical patent/WO2009133403A3/en

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20: Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/23: Processing of content or additional data; Elementary server operations; Server middleware
    • H04N 21/235: Processing of additional data, e.g. scrambling of additional data or processing content descriptors
    • H04N 21/2355: Processing of additional data involving reformatting operations of additional data, e.g. HTML pages
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/102: Adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/132: Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H04N 19/169: Adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/184: Adaptive coding characterised by the coding unit, the unit being bits, e.g. of the compressed video stream
    • H04N 19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/587: Predictive coding involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence
    • H04N 19/60: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N 19/61: Transform coding in combination with predictive coding
    • H04N 19/63: Transform coding using sub-band based transform, e.g. wavelets
    • H04N 19/80: Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
    • H04N 21/234: Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N 21/2343: Processing of video elementary streams involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N 21/234381: Reformatting operations by altering the temporal resolution, e.g. decreasing the frame rate by frame skipping
    • H04N 23/00: Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 7/00: Television systems
    • H04N 7/002: Special television systems not provided for by H04N 7/007 - H04N 7/18

Definitions

  • the present invention relates in particular examples to methods and systems for providing and processing video and television signals.
  • Recently, significant efforts have been made to provide video and television products and services with improved visual quality, for example by way of high-definition televisions, set-top-boxes and broadcasts, and high-definition video media such as HD-DVD and Blu-ray.
  • these new products and services are often still based to a large extent on old video formats that give rise to certain quality limitations, for example in relation to the reproduction of fast motion.
  • Motion blur and motion aliasing artefacts are examples of problems that can arise in these systems.
  • a method of providing a digital video signal comprising encoding the signal with a colour depth of at most two bits per pixel colour component.
  • the method comprises encoding the signal with a colour depth of one bit per colour component.
  • the signal preferably has a frame rate substantially higher than conventional video, television or film frame rates.
  • the signal preferably has a high frame rate selected such that, when displayed at the high frame rate, the effective colour depth as perceived by a viewer exceeds the actual encoded colour depth.
  • increased temporal resolution can be traded against reduced colour resolution, without loss of overall colour information (or at least without substantial loss), because the colour information remains in the signal but is in effect temporally encoded, effectively as frequency information of 1-bit or 2-bit colour depth colour components.
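As a concrete illustration of this temporal encoding (not part of the patent; the choice of a first-order sigma-delta modulator is an assumption made here for the sketch), the code below encodes a colour component as a stream of 1-bit frames. Averaging the stream over a window, which is in effect what the eye does at a sufficiently high frame rate, recovers the original value to within the quantisation limit of the window length.

```python
def encode_temporal_1bit(value, n_frames):
    """Encode a colour component in [0.0, 1.0] as n_frames 1-bit samples
    using first-order error feedback (sigma-delta) modulation."""
    bits = []
    error = 0.0
    for _ in range(n_frames):
        error += value
        bit = 1 if error >= 0.5 else 0
        error -= bit          # carry the quantisation error forward
        bits.append(bit)
    return bits

def decode_average(bits):
    """Recover the component by temporal averaging."""
    return sum(bits) / len(bits)

# a component of 100/255 encoded over 256 frames is recovered to within 1/256
target = 100 / 255
recovered = decode_average(encode_temporal_1bit(target, 256))
assert abs(recovered - target) < 1 / 256
```

Because the modulator carries its quantisation error forward, the long-run density of 1s tracks the input value, which is the sense in which the colour survives as "frequency information" in a 1-bit stream.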
  • the quality of the video signal can be improved with regard to certain characteristics in particular in relation to representation of motion as will be explained later.
  • a method of providing a digital video signal comprising encoding the signal with a first colour depth at a high frame rate, the high frame rate selected such that, when displayed at the high frame rate, the effective colour depth as perceived by a viewer exceeds the first colour depth with which the signal was encoded.
  • the effective colour depth as perceived by a viewer when the signal is displayed at the high frame rate preferably corresponds at least to the colour depth of an image encoded with 4 bits per colour component, preferably 5 bits per colour component, more preferably 8 bits per colour component.
  • the temporally encoded colour information enables presentation or regeneration of an image from the low colour depth signal corresponding to a conventional colour encoding.
  • the frame rate is preferably sufficiently high such that colour information can be recovered from the signal to produce an equivalent of an image encoded with at least 4 bits per colour component, preferably at least 5 bits per colour component, more preferably at least 8 bits per colour component.
  • the signal has a frame rate which is sufficiently high so that display of the signal at the high frame rate produces the effect of a full colour image on a viewer.
  • the signal has a frame rate of at least 5,000fps, preferably at least 10,000fps, more preferably at least 20,000fps. In certain embodiments, the signal preferably has a frame rate of at least 50,000fps, more preferably at least 100,000fps.
  • the method may comprise encoding the video signal using a given conventional video format having an associated conventional frame rate, using a frame rate higher than the conventional frame rate, the high frame rate selected such that the data rate of the encoded signal at the high frame rate does not exceed the data rate of a conventional video signal encoded using the conventional video format at the conventional frame rate.
  • the method may comprise selecting the frame rate to enable transmission via a conventional video interface or channel, preferably an SDI (Serial Digital Interface) interface or channel.
  • the high frame rate may be, for example, 8, 10, 256 or 1024 times the selected conventional television or film frame rate.
  • the signal is preferably encoded using one bit for any primary colour component or luma component and at most two bits for any chroma component.
  • the signal may be encoded using, for each pixel, a plurality of colour components corresponding to primary colours, each colour component encoded using (exactly) one bit.
  • the signal may be encoded in RGB colour space using one bit for each of the red, green and blue components.
  • the signal may be encoded in a format having a luma component and a plurality of (for example two) chroma (or colour difference) components, the encoding using one bit for the luma component and at most two bits for each chroma component.
  • chroma components may each be encoded with sign and magnitude bits.
  • the luma and chroma components may each be encoded with (exactly) one bit.
  • the method comprises converting the signal to a second signal having a second colour depth greater than that of the signal.
  • This may comprise performing spatial and/or temporal averaging of pixel values to obtain pixel values at the second colour depth.
  • the method may also comprise compressing the converted video signal (for example using a conventional compression algorithm).
  • the term "pixel value" as used herein encompasses the individual colour component values making up a pixel.
  • the method may comprise processing the video signal to perform image stabilisation based on correlation or motion estimation between successive frames.
  • a method of processing a video signal comprising: receiving an input video signal defining a sequence of temporal image units (e.g. frames or fields) encoded with a first colour depth (preferably one or two bits per colour component as set out above); and generating an image unit (e.g. frame or field) encoded with a second colour depth greater than the first colour depth from image units (frames or fields) of the sequence, the generating comprising: aggregating source pixel values for a plurality of frames or fields in the sequence to produce output pixel values for the generated frame or field.
  • the plurality of frames or fields (or other image units) are preferably adjacent frames or fields in the sequence, but may alternatively be non-adjacent frames or fields.
  • the image units are preferably frames.
  • the input video signal is preferably at a comparatively high frame rate as set out above.
  • the output pixel values preferably correspond to an average of the source pixel values quantized to the second colour depth.
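The aggregation step described above can be sketched as follows; this is an illustrative implementation, not the patent's, and the frame layout (nested lists) and rounding rule are assumptions:

```python
def aggregate_frames(frames):
    """frames: list of equally sized 2-D arrays of 1-bit values (0 or 1).
    Returns one 2-D array of 8-bit output values (0..255), each being the
    temporal average of the co-sited source pixels quantised to 8 bits."""
    n = len(frames)
    height, width = len(frames[0]), len(frames[0][0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = sum(f[y][x] for f in frames)
            out[y][x] = round(255 * total / n)   # quantise to second colour depth
    return out

# a pixel 'on' in 64 of 256 one-bit frames averages to about quarter grey
frames = [[[1]] for _ in range(64)] + [[[0]] for _ in range(192)]
assert aggregate_frames(frames) == [[64]]
```

With separate colour components, the same averaging would simply be run once per component, as the surrounding bullets note.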
  • the pixel values may, for example, comprise separate colour components, with aggregation performed for each colour component.
  • the method may comprise generating a sequence of frames or fields with the second colour depth, each frame or field generated from a respective group of adjacent frames or fields of the input video signal.
  • each group of adjacent frames defines a window of a given number of frames or fields selected in relation to a given time instant, with an output frame or field generated for that time instant from the frames in the window.
  • the windows for successive output frames can overlap in which case an individual source frame may contribute to multiple output frames.
  • the window progresses along the signal's time axis as successive output frames are generated. This allows, for example, a sequence of full colour depth frames to be generated at the same frame rate as the original low colour depth video signal.
  • the windows (groups) may be discrete in which case the output signal may have a reduced frame rate (e.g. a conventional frame rate).
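The two windowing schemes (overlapping sliding windows versus discrete groups) can be made concrete with a small sketch; the window length `w` and the indexing convention are assumptions for illustration:

```python
def sliding_windows(n_frames, w):
    """For each output instant, the indices of the w most recent source
    frames (overlapping windows: one output frame per input frame)."""
    return [list(range(max(0, i - w + 1), i + 1)) for i in range(n_frames)]

def discrete_windows(n_frames, w):
    """Non-overlapping groups of w frames (output rate reduced by w)."""
    return [list(range(i, i + w)) for i in range(0, n_frames - w + 1, w)]

assert len(sliding_windows(12, 4)) == 12      # full output frame rate
assert len(discrete_windows(12, 4)) == 3      # reduced output frame rate
# with sliding windows, one source frame contributes to several outputs
assert sliding_windows(12, 4)[6] == [3, 4, 5, 6]
```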
  • the method comprises, for each input frame or field of the input video signal: generating an output frame or field in the output signal based on a group of frames or fields in the input signal including said input frame or field.
  • the aggregating is performed on one or more groups of adjacent frames or fields, and the number of frames or fields in the or each group is preferably selected in accordance with a variable parameter. In this way, a variable artificial shuttering effect can be applied to the signal as explained in more detail later.
  • the input video signal in this aspect is preferably a video signal as provided by any method set out above.
  • the above methods and features may, for example, be implemented in an image capturing device such as a digital still or video camera.
  • the invention provides an image capturing device comprising: image sensor means for outputting a sequence of image frames with a first colour depth of one bit per colour component; and means for generating an output image with a colour depth greater than the first colour depth from a group of (preferably adjacent) frames of the sequence, the generating means adapted to aggregate source pixel values for the group of frames to generate output pixel values for the output image.
  • the image capturing device comprises means for receiving a parameter defining a virtual shutter speed; wherein the group of frames used to generate the output image is selected to extend over a given time period in accordance with the received parameter.
  • the image sensor means is preferably adapted to output the sequence of image frames at a given frame rate, and the generating means is preferably adapted to select a number of adjacent frames for use in generating the output image in accordance with the received parameter. In this way, an artificial shuttering effect can be implemented, in which the parameter defines the shutter duration.
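A minimal sketch of how the virtual shutter parameter could map to a frame count, assuming (for illustration only) that the parameter is an exposure duration in seconds and the sensor rate is fixed:

```python
def frames_for_shutter(sensor_fps, shutter_seconds):
    """Number of adjacent high-rate source frames spanning the virtual
    exposure; at least one frame is always used."""
    return max(1, round(sensor_fps * shutter_seconds))

# at a 10,000fps sensor, a virtual 1/50 s shutter aggregates 200 frames
assert frames_for_shutter(10_000, 1 / 50) == 200
# a shutter shorter than one frame period still uses a single frame
assert frames_for_shutter(10_000, 1 / 100_000) == 1
```

Varying the parameter then varies the window length used by the aggregation, which is the artificial shuttering effect the bullet describes.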
  • the device may be used to acquire a still image.
  • the generating means may be adapted to generate a video signal comprising a plurality of output image frames with a colour depth greater than the first colour depth.
  • the device may further comprise means for analysing image frames of the sequence of image frames to track motion and/or perform image stabilisation, wherein image stabilisation is preferably performed on the generated image.
  • the invention also provides apparatus or a television system having means for performing any method as set out herein or having means for processing, storing or transmitting a digital video signal as provided by any method as set out herein.
  • a television system adapted to provide, process or use a television signal having a frame rate substantially higher than conventional television and/or film frame rates.
  • the system comprises means for transmitting the signal over a transmission medium to a plurality of television receiver devices.
  • the signal may be transmitted at the high frame rate or as a converted signal having a lower frame rate which is derived from the high frame rate signal as is explained in more detail later.
  • the signal may be encoded with a bit depth of at most two bits per colour component, preferably one bit per colour component, as set out above.
  • the frame rate is preferably substantially higher than one or more (preferably each) of: PAL frame or field rate, NTSC frame or field rate, and standard film frame rate.
  • Such frame rates (ranging from 24fps for film to 30fps for NTSC) have been in use widely for decades (indeed since the very beginning of film and television) and have generally been felt to be adequate since they allow for the reproduction of apparently smooth moving images. This, together with the high data rates required for encoded video, has led to a prejudice in the art for choosing such relatively low frame rates.
  • underlying the present invention is the realisation that significant quality gains can be obtained by using much higher frame rates, without necessarily requiring a proportional increase in data rates.
  • the frame rate is at least 80fps, preferably at least 100fps, more preferably at least 150fps. Even higher frame rates may lead to correspondingly greater improvements in video quality. Accordingly, the frame rate is advantageously at least 300fps, preferably at least 600fps, more preferably at least 1200fps. Still higher frame rates may be used, for example at least 2500fps or at least 5000fps. Some embodiments use frame rates of at least 10,000fps or even at least 100,000fps. Examples of such embodiments are described below in connection with a one-bit colour component encoding.
  • the frame rate is a multiple of one or more selected conventional television frame or field rates or film rates.
  • the frame rate may be a multiple of each of a plurality of selected conventional television frame or field rates or film rates. This enables easier conversion between the relevant conventional rates and the high frame rate.
  • the frame rate may be a multiple of one or more, preferably each, of: 25fps PAL frame rate, 30fps approximate NTSC rate and 24fps standard film rate.
  • the frame rate is a multiple of one or more, preferably each of: 50 fields-per-second PAL field rate, 60 fields-per-second approximate NTSC field rate, and 24fps standard film rate.
  • the frame rate is a multiple of each of the above (24fps, 25fps, 30fps), and thus is preferably 600fps or a multiple thereof.
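The 600fps figure can be checked directly as the least common multiple of the three conventional rates (using Python's `math.lcm`, available from Python 3.9):

```python
import math

# 600 is the smallest rate that is a whole multiple of 24, 25 and 30 fps
assert math.lcm(24, 25, 30) == 600
# it is also a multiple of the 50 and 60 fields-per-second field rates
assert 600 % 50 == 0 and 600 % 60 == 0
```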
  • the term "television signal" preferably refers to a video signal which can be broadcast on a television network, but may also include such a signal as recorded by a camera and/or processed in a production/editing system prior to broadcast.
  • the signal preferably defines a sequence of images displayable at the frame rate. In other words, a sequence of images (or frames) for display at the frame rate and at a given display resolution are encoded in the signal (though the encoding may use compression including, for example, frame prediction, so that each frame need not necessarily be encoded pixel-by-pixel).
  • the signal is preferably a digital television signal, more preferably a digital television broadcast signal for broadcast on a digital television network.
  • the method may comprise sampling one or more chroma components at a lower temporal rate than a luma (or luminance) component, preferably at half the luma sampling rate or less, more preferably at a quarter the luma sampling rate or less. Spatial chroma sub-sampling may be used in addition to temporal chroma sub-sampling.
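Temporal chroma sub-sampling as described above can be sketched as follows; the frame representation (per-frame `(y, cb, cr)` tuples) and the sample-and-hold behaviour between chroma samples are assumptions made for illustration:

```python
def temporal_chroma_subsample(frames, factor=4):
    """frames: list of (y, cb, cr) tuples, one per frame. Luma is kept at
    the full rate; chroma is sampled only every `factor` frames, with the
    last sampled chroma held (repeated) in between."""
    out = []
    last_cb = last_cr = None
    for i, (y, cb, cr) in enumerate(frames):
        if i % factor == 0:
            last_cb, last_cr = cb, cr
        out.append((y, last_cb, last_cr))
    return out

src = [(i, 10 + i, 20 + i) for i in range(8)]
sub = temporal_chroma_subsample(src)
assert [f[0] for f in sub] == list(range(8))                    # luma at full rate
assert [f[1] for f in sub] == [10, 10, 10, 10, 14, 14, 14, 14]  # chroma held
```

Spatial chroma sub-sampling (as in conventional 4:2:0 formats) could be applied to the `cb`/`cr` planes independently of this temporal decimation.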
  • the signal may be encoded with a bit depth of at most two bits per colour component, preferably with one bit per colour component or with one bit plus a sign bit (depending on the colour space used).
  • a frame rate of at least 10,000fps may be used in this case. Given a sufficiently high frame rate, this can enable efficient encoding and processing of the signal.
  • the system preferably comprises means for providing source video at the high frame rate, the providing means preferably including one or more cameras operable to capture source video at the high frame rate.
  • Other high frame rate sources may be used, for example a high frame rate animation system.
  • the source video may form the television signal or may be incorporated into the television signal (for example by editing to combine with other sources).
  • the system may also comprise means for performing video editing or video processing at the high frame rate (on the high frame rate signal or to produce the high frame rate signal).
  • the system comprises means for performing video effects processing on the high frame rate signal to add a video effect to the high frame rate signal, preferably a shuttering effect or a lighting effect.
  • with a high frame rate, certain new video processing effects can be implemented.
  • the system preferably comprises means for transmitting the high frame rate signal to end user equipment, the end user equipment preferably adapted to output the signal at the high frame rate.
  • the system may include the end user equipment, which may comprise one or more televisions, set top boxes, cinema display systems, personal computers, mobile devices, or the like.
  • the transmitting means preferably includes a broadcast system for broadcasting the television signal on a broadcast network, such as a digital terrestrial, cable or satellite network.
  • the television system may thus include one or more video cameras or other sources, video production/editing system(s), transmission system(s) and end user equipment all operating at the high frame rate, thus providing a complete high frame rate television production, distribution and display system.
  • the system may comprise means for compressing the signal, preferably using three-dimensional block-based coding, preferably using temporal and/or spatial prediction of three-dimensional blocks of video data.
  • the system may comprise means for converting the television signal at the high frame rate to a low frame rate signal, preferably at a standard television frame rate or film rate, in which case the system may further provide means for transmitting the low frame rate signal to end user equipment. This can enable interoperability with conventional end user equipment which is not compatible with the higher frame rate.
  • the system may comprise means for deriving information for use in compression from the high frame rate signal, and means for compressing the low frame rate signal using the derived information.
  • the system may comprise means for deriving motion information from the high frame rate signal and means for compressing the low frame rate signal using the derived motion information. This can improve compression efficiency and quality, allowing the low frame rate signal to benefit from the additional information available in the high frame rate source signal.
  • the system may alternatively or additionally comprise means for deriving motion information from the high frame rate signal, and means for transmitting the motion information with the low frame rate signal to a receiver.
  • the receiver (e.g. end user equipment) may be adapted, using the received motion information, to output a signal at a frame rate higher than the low frame rate at which the signal was transmitted, preferably at the original high frame rate.
  • reproduction quality of the low frame rate signal may be improved using information derived from the high quality signal.
  • the invention provides apparatus having means for processing a television signal having a frame rate substantially higher than conventional television or film frame rates, and a method of providing a television signal having a frame rate substantially higher than conventional television or film frame rates, both optionally with corresponding preferred features.
  • the apparatus may be a digital television receiver or set-top box, or a display, for displaying the signal at the high frame rate, a camera for capturing the signal at the high frame rate, or a video processing/editing apparatus for processing/editing video at the high frame rate.
  • the invention also provides a corresponding video editing system comprising means for editing video at a frame rate substantially higher than conventional television or film frame rates, again optionally with corresponding preferred features.
  • a method of providing a digital video signal comprising encoding the signal with a bit depth of at most two bits per colour component.
  • the encoding may use one bit plus a sign bit for one, some or all colour components and/or may use a single bit for one, some or all colour components.
  • for monochrome video there may be only a single colour component for each pixel, whilst a colour image will have multiple (typically three) colour components per pixel.
  • the video signal is preferably colour video.
  • the signal preferably has a frame rate substantially higher than conventional television or film frame rates.
  • the signal has a frame rate which is sufficiently high so that display of the signal at the high frame rate produces the effect of a full colour image on a viewer (due to the integration performed by the eye the one-bit nature of each frame becomes substantially imperceptible, i.e. the viewer sees colours that are not actually present at any given instant in time because the pixels are switched too rapidly to be perceived separately).
  • a "full colour image" preferably means an image having a colour range corresponding at least to conventional standard or high definition television or video formats.
  • the signal has a frame rate of at least 10,000fps, more preferably at least 100,000fps. In some cases, frame rates of at least 250,000fps, at least 500,000fps or even at least 1,000,000fps may be used. By using an extremely high frame rate of, for example, at least 10,000fps, the effect of a colour image can be created despite the 1-bit sampling used.
  • the method may further comprise converting between the one-bit (or one-bit plus sign) video format and a video format using multiple (more) bits per colour component, preferably a standard television or film format.
  • the invention also correspondingly provides a camera adapted to record a digital video signal with a bit depth of at most two bits, preferably one bit per colour component (or one-bit plus sign as described above), and a television receiver or display adapted to receive, decode, output and/or display a digital video signal having a bit depth of at most two bits, preferably one bit (or one-bit plus sign) per colour component.
  • the invention provides a video editing system comprising means for editing video encoded with a bit depth of at most two bits, preferably one bit per colour component (or one-bit plus sign), preferably at a frame rate substantially higher than conventional television or film frame rates, preferably at least 10,000fps, more preferably at least 100,000fps.
  • the invention also provides a video conversion system comprising means for converting between a first video format having a high frame rate and a bit depth of at most two bits, preferably one bit per colour component (or one bit plus sign) and a second video format having a lower frame rate and a bit depth of multiple (or more) bits per colour component.
  • the second video format is preferably a standard television or film format, preferably a PAL or NTSC or HDTV television format.
  • the invention provides a method of providing a video signal, comprising sampling at least one colour component of the signal at a different temporal rate to one or more other colour components.
  • one or more chroma components are sampled at a lower temporal rate than a luma (or luminance) component, preferably at half the luma sampling rate or less, more preferably at a quarter the luma sampling rate or less.
  • Spatial chroma sub-sampling may be used in addition to temporal chroma sub-sampling.
  • the signal preferably has a frame rate substantially higher than conventional rates as described above and/or may be encoded with two, preferably one, bits per colour component as described above.
  • the invention provides a method of providing a compressed video signal, comprising: receiving a video signal at a first frame rate; deriving information for use in compression from the video signal at the first rate; converting the video signal at the first frame rate to a video signal at a second frame rate different from the first frame rate; and compressing the video signal at the second frame rate using the derived information.
  • the second frame rate is preferably lower than the first frame rate. In this way, compression quality and efficiency for the lower frame rate signal can be improved using information from the higher frame rate signal.
  • the information is preferably motion prediction or motion vector information (e.g. one or more motion vectors) for use in interframe compression of the video signal.
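One plausible way the derived motion information could be reused, sketched here as an assumption rather than the patent's specified method: motion vectors estimated between adjacent high-rate frames (where motion per frame is small and easy to track) are summed over the interval separating two low-rate frames, yielding a long-range vector for the low-rate encoder without a fresh, harder search:

```python
def accumulate_motion(high_rate_vectors, decimation):
    """high_rate_vectors: per-frame (dx, dy) vectors between adjacent
    high-rate frames. Returns the composed vectors between adjacent
    low-rate frames after decimating by `decimation`."""
    out = []
    for i in range(0, len(high_rate_vectors), decimation):
        chunk = high_rate_vectors[i:i + decimation]
        dx = sum(v[0] for v in chunk)
        dy = sum(v[1] for v in chunk)
        out.append((dx, dy))
    return out

# an object moving (1, 0) pixels per high-rate frame, 8x frame-rate decimation
assert accumulate_motion([(1, 0)] * 16, 8) == [(8, 0), (8, 0)]
```

Simple summation assumes roughly linear motion over the interval; a real encoder would refine the composed vector, but the high-rate estimate gives it a strong starting point.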
  • the video signal may be encoded using a bit depth of one bit or another encoding described herein.
  • the invention provides a method of transmitting a video signal, comprising: receiving a video signal at a high frame rate; deriving motion information from the high frame rate video signal; converting the high frame rate video signal to a low frame rate video signal; transmitting the low frame rate video signal and the motion information to a receiver; and at the receiver, deriving from the received low frame rate signal a signal at an output frame rate higher than the low frame rate at which the signal was transmitted using the received motion information, and outputting the derived signal at the output frame rate.
  • the output frame rate may be equal to the high frame rate.
  • the invention provides a method of processing video, comprising: receiving a video signal at a high frame rate, the high frame rate being substantially higher than conventional television frame rates; and performing video processing at the high frame rate to produce a processed video signal at the high frame rate.
  • the method comprises converting the processed video signal to a low frame rate lower than the high frame rate, preferably a standard television or film frame rate (e.g. 24fps, 25fps or 30fps); and outputting the low frame rate processed video signal.
  • a standard television or film frame rate e.g. 24fps, 25fps or 30fps
  • the step of performing video processing may comprise performing video editing.
  • Video editing at high frame rate can enable greater editing accuracy.
  • video processing may comprise performing effects processing, preferably to apply a shuttering or lighting effect.
  • the step of performing video processing may comprise applying a synthetic camera shuttering effect to the video signal.
  • a method of processing video comprising: receiving a video signal at a given (preferably high) frame rate, the given frame rate preferably being substantially higher than conventional television or film frame rates; and processing the signal to apply a synthetic camera shuttering effect to the video signal.
  • Applying a camera shuttering effect may comprise applying one or more temporal filters to the video signal.
  • the one or more temporal filters may include one or more of: a non-rectangular temporal filter; an exponential temporal filter; a Gaussian temporal filter; a sine function temporal filter; a windowed sine function temporal filter; a non-linear temporal filter such as a median or rank order or morphological filter.
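As a minimal sketch of one of the filters listed above, a normalised Gaussian temporal filter can be applied to each pixel's values across frames; the function names, sigma and radius are illustrative only:

```python
import math

def gaussian_taps(sigma, radius):
    """Normalised Gaussian filter taps over 2*radius + 1 frames."""
    taps = [math.exp(-0.5 * (t / sigma) ** 2) for t in range(-radius, radius + 1)]
    total = sum(taps)
    return [t / total for t in taps]

def temporal_filter(pixel_over_time, taps):
    """Filter one pixel's values across frames (edge frames are repeated),
    emulating a soft synthetic shutter rather than a rectangular one."""
    r, n = len(taps) // 2, len(pixel_over_time)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(taps):
            j = min(max(i + k - r, 0), n - 1)   # clamp at the sequence ends
            acc += w * pixel_over_time[j]
        out.append(acc)
    return out

# A rectangular shutter would weight frames equally; the Gaussian taps
# weight the centre of the synthetic exposure most heavily.
taps = gaussian_taps(sigma=1.0, radius=2)
softened = temporal_filter([0.0, 0.0, 1.0, 0.0, 0.0], taps)
```

Differential shuttering, as described above, would simply apply different taps to different image regions, motion classes or colour components.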
  • the shuttering effect may be applied differentially across the video image.
  • the shuttering effect may be applied differentially depending on the type of motion. Shuttering may be applied differentially for different colour components.
  • the step of performing video processing may comprise detecting the effect on a scene (as represented in one or more frames) caused by a variable light source based on the characteristic rate and/or phase of variation of the light source, and may optionally comprise modifying the scene accordingly.
  • the step of performing video processing may comprise identifying regions of or objects in one or more video images where elements of the scene are illuminated by a given light source by detecting the characteristic rate and/or phase of flicker of the light source.
  • the method may further comprise modifying one or more video images or identified objects or regions thereof to modify, reduce or remove illumination caused by the light source.
  • the method may comprise distinguishing between multiple light sources based on respective flicker rates or phases of the light sources.
  • the method may also comprise processing the video image to remove a short instantaneous flash from a particular light source (e.g. due to flash photography or lightning).
  • the high frame rate can thus enable detection of image contributions which might not be detectable at conventional frame rates.
  • the method may comprise detecting the one or more frames or parts of frames illuminated by the flash based on one or more of: higher than average luma; desaturation of colours; a degree of white clipping exhibited by the signal.
  • the method comprises removing the flash at least partially by a process including one or more of: applying a median filter or other filter in time to the detected frames or parts of frames; reducing the luma level of the detected frames or parts of frames to that of surrounding frames or parts of frames; replacing one or more affected frames or parts of frames of the image with frames or parts of frames generated by interpolation from surrounding frames or parts of frames.
  • the term "surrounding" may here mean, in relation to parts of frames, spatially or temporally adjacent or near; and, in relation to whole frames, temporally adjacent or near.
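A minimal sketch of the flash-removal options above, combining mean-luma jump detection with a per-pixel temporal median over the detected frames; the window size and threshold are illustrative assumptions, and a real detector would also use the desaturation and white-clipping cues listed:

```python
def remove_flash(luma_frames, window=2, threshold=0.5):
    """Detect frames whose mean luma jumps above the neighbourhood average
    and replace them by a per-pixel temporal median over the window."""
    n = len(luma_frames)
    means = [sum(f) / len(f) for f in luma_frames]
    out = []
    for i, frame in enumerate(luma_frames):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        neighbours = [means[j] for j in range(lo, hi) if j != i]
        baseline = sum(neighbours) / len(neighbours)
        if means[i] - baseline > threshold:
            # flash frame: per-pixel median over the surrounding frames
            out.append([sorted(luma_frames[j][p] for j in range(lo, hi))
                        [(hi - lo) // 2] for p in range(len(frame))])
        else:
            out.append(frame)
    return out
```

At a high frame rate a photographic flash occupies only one or two frames, so a short temporal median removes it without visibly blurring motion.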
  • the step of performing video processing may also comprise filtering a video sequence.
  • Filtering may be applied differentially across an image.
  • Temporal filtering may be performed.
  • filtering may be performed by applying a three dimensional convolution function to the video sequence, wherein the convolution function is shaped to filter along the trajectory of motion present in the video sequence.
  • the invention provides a method of processing video to detect or modify the contribution to a scene from a regularly varying light source, the method comprising: capturing the video at a frame rate which is greater than the rate of variation of the light source; and detecting a contribution to one or more frames of video due to the light source based on the rate and/or phase of variation of the light source; and optionally further comprising modifying one or more of the frames of video based on the detected contribution.
  • the frame rate is preferably substantially greater than the rate of variation or flicker rate of the light source, preferably at least twice the flicker rate, more preferably at least four times or at least ten times the flicker rate.
  • the frame rate may be as described above.
  • the method may comprise modifying the frame or frames to modify, increase, reduce or remove the detected contribution due to the light source.
  • the frame rate of the source video is preferably at least twice the rate of variation (the flicker rate or frequency) of the light source. Multiple light sources may be detected/processed, in which case the frame rate is preferably at least twice the maximum frequency of the light sources.
  • the contribution may be detected in part based on differences in colour temperature. Modifying one or more frames may comprise correcting differences in white balance or colour shading within the one or more frames.
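The detection step above can be sketched with a single-bin discrete Fourier transform of the per-frame mean luma at the lamp's characteristic frequency; the simulated numbers (300fps capture, 100Hz flicker) are illustrative only:

```python
import cmath
import math

def flicker_component(frame_means, frame_rate, flicker_hz):
    """Single-bin DFT of per-frame mean luma: returns the amplitude and
    phase of the variation at the light source's characteristic frequency."""
    n = len(frame_means)
    acc = sum(m * cmath.exp(-2j * math.pi * flicker_hz * k / frame_rate)
              for k, m in enumerate(frame_means))
    return 2 * abs(acc) / n, cmath.phase(acc)

# Simulated 300fps capture under a lamp flickering at 100Hz (twice the
# 50Hz mains frequency, as full-wave rectified lighting does): one second
# of frames whose mean luma oscillates by +/-0.1 around 0.5.
fps, lamp_hz = 300, 100
means = [0.5 + 0.1 * math.cos(2 * math.pi * lamp_hz * k / fps)
         for k in range(fps)]
amp, phase = flicker_component(means, fps, lamp_hz)
```

Because 300fps exceeds twice the 100Hz flicker rate, the amplitude and phase are recovered unaliased; applying the same measurement per region or per object distinguishes areas lit by different sources.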
  • the invention also provides a method of compressing a video signal, preferably (though not necessarily) having a frame rate substantially higher than conventional television or film frame rates, the video signal having two spatial dimensions and a temporal dimension, the method comprising: dividing the signal into a plurality of three-dimensional blocks of video data; and encoding the three-dimensional blocks.
  • Encoding a block preferably comprises calculating a frequency domain transform of the block; and quantising the resulting transform coefficients.
  • the method preferably comprises calculating prediction information for a block, and encoding the block using the prediction information.
  • the method preferably comprises performing spatial and/or temporal prediction. Using such a three-dimensional compression process, in particular for a high frame rate signal, can improve compression efficiency by exploiting the high degree of similarity between frames.
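As an illustrative stand-in for the frequency-domain transforms contemplated above, a one-level 3-D Haar transform on a 2x2x2 (time, height, width) block shows how energy compacts when adjacent frames are similar; block size, transform and quantiser step are all assumptions for the sketch:

```python
def haar3d(block8):
    """One-level 3-D Haar transform of a 2x2x2 block stored as a flat list
    of 8 samples indexed by bits (t, y, x): average/difference pairs are
    formed along each of the three axes in turn."""
    v = list(block8)
    for axis_bit in (1, 2, 4):                 # x, then y, then t
        nxt = v[:]
        for i in range(8):
            if not i & axis_bit:
                a, b = v[i], v[i | axis_bit]
                nxt[i] = (a + b) / 2           # low-pass (average)
                nxt[i | axis_bit] = (a - b) / 2  # high-pass (difference)
        v = nxt
    return v

def quantise(coeffs, step):
    """Uniform scalar quantisation of the transform coefficients."""
    return [round(c / step) for c in coeffs]

# A block that is flat in space and static in time compacts all of its
# energy into the single DC coefficient.
flat = haar3d([5.0] * 8)
# A block that brightens between the two frames additionally produces
# one temporal-difference coefficient.
step_in_time = haar3d([0, 0, 0, 0, 1, 1, 1, 1])
```

The higher the frame rate, the smaller the temporal-difference coefficients tend to be, which is exactly what the quantiser and entropy coder exploit.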
  • the invention also provides a corresponding method for decompressing a signal compressed by any such method.
  • the invention also provides a video processing system or apparatus adapted to perform any method as described herein, and a computer program or computer program product comprising software code adapted, when executed on a data processing apparatus, to perform any method as described herein.
  • the invention also provides a computer program and a computer program product for carrying out any of the methods described herein and/or for embodying any of the apparatus features described herein, and a computer readable medium having stored thereon a program for carrying out any of the methods described herein and/or for embodying any of the apparatus features described herein.
  • the invention also provides a signal embodying a computer program for carrying out any of the methods described herein and/or for embodying any of the apparatus features described herein, a method of transmitting such a signal, and a computer product having an operating system which supports a computer program for carrying out any of the methods described herein and/or for embodying any of the apparatus features described herein.
  • Figure 2 illustrates a process of capturing, processing and displaying a high frame rate video signal
  • Figure 3 illustrates low frame rate and high frame rate signals
  • Figure 4 illustrates a process for high frame rate video processing and downsampling
  • Figure 5 illustrates a video compression process using a high frame rate video source
  • Figure 6 illustrates a method of video compression
  • Figure 7 illustrates three-dimensional (spatial/temporal) block coding and prediction
  • Figure 8 illustrates 1-bit image capture.
High frame rate video system
  • A high frame rate television/film production system is shown in overview in Figure 1.
  • the system comprises a high frame rate camera 12 for providing source video and a high frame rate editing system 16 for editing the captured video.
  • Storage 14 is provided for storing unedited and edited video content, and may include disc storage, tape storage, and/or any other suitable storage media.
  • Editing system 16 may make use of a variety of editing/post-production components, such as an animation generator 18, effects unit 20, graphics generator 22 and video mixer 24.
  • Edited or raw video content is transmitted using a broadcast subsystem 26 to a broadcast network 30, such as a digital or analogue terrestrial, satellite or cable network.
  • Video content may also be made accessible via a media server 28 to be accessed over a data network, for example the Internet 32.
  • Transmitted content is received from the broadcast network or Internet by end-user devices such as a set-top box 34, personal computer 36 or cinema display system 38.
  • All the elements of the system are adapted to operate at a substantially higher frame rate than is used in current conventional television/film systems, as is described in more detail below.
  • Figure 1 shows representative elements of such a television system.
  • a given system may contain only some of the elements shown and may contain other elements not shown. Only one of each element is shown, though the system will typically include multiple such elements, e.g. multiple cameras, set-top boxes, effects processors etc.
  • the system will typically also include compression/decompression functionality to enable compression of digital video data for storage and transmission.
  • Embodiments of the invention provide a high frame/field rate television and film system, in which the quality of the reproduced image is improved by the use of increased frame and/or field rates throughout the entire system.
  • video or film pictures are captured at a rate in excess of the conventional 24/50/60 frames/fields per second and are stored, transmitted and reproduced at this higher frame/field rate.
  • the system could run at a rate of 300 frames per second.
  • editing system components e.g. animation generators 18, graphics generators 22, video mixers 24 and effects units 20
  • the transmission system then broadcasts this high frame rate signal to the home or to cinemas or other places for decoding and display, again at the higher frame rate.
  • the high frame rate signal can also be recorded onto fixed media such as disc/tape or downloaded file for sale to the home or onto other storage media for distribution to cinemas or other venues.
  • the resulting TV pictures can give far better rendition of motion and of moving detail than current TV and film systems, in particular by reducing motion blur.
  • the reduction in motion blur can greatly improve the sharpness of anything in the picture which is moving relative to the camera/display.
  • the removal of motion blur can also improve the sense of reality of the image, removing the tendency for moving objects to become transparent.
  • the sharpening of moving edges can lead to an improved occlusion of backgrounds, improving the sense of the image being real and three-dimensional.
  • the resulting TV images can also have lower levels of motion aliasing.
  • Increased frame rates at the camera and display can also reduce other flicker and motion related effects such as photosensitive and pattern-sensitive epilepsy.
  • Video effects processing (e.g. by animation generator 18, effects unit 20, graphics generator 22 and video mixer 24) can be performed at the higher frame rate to achieve higher-quality effects.
  • Video editing can be performed (e.g. by editing system 16) more accurately due to the increased temporal resolution.
  • a 3D animation/rendering system may provide the source video at the high frame rate.
  • the frame rate chosen is preferably substantially higher (preferably many times higher) than the conventional film / television frame or field rates.
  • Such conventional rates include the 24fps film rate; 25fps television (e.g. PAL); 30fps/29.97fps television (e.g. NTSC); and corresponding field rates of 50/60 (59.94) fields-per-second.
  • the NTSC rate of 29.97fps can be approximated to 30fps; for example, standards conversion can be achieved by accelerating a 29.97fps signal to 30fps prior to performing further processing.
  • references herein to a 30fps rate are intended to also include the standard NTSC rate of 29.97fps (and references to 60 fields- per-second field rates similarly include the standard NTSC field rate of 59.94 fields-per-second).
  • a frame rate of at least 80fps is therefore preferably used.
  • the frame rate is at least 100fps, more preferably at least 150 fps, more preferably at least 300fps.
  • an even higher frame rate may be selected, for example at least 600fps, at least 900fps or even at least 1200fps.
  • the frame rate chosen has a simple mathematical relationship with one or more selected conventional frame rates.
  • the chosen frame rate is a multiple of at least one selected conventional frame rate or field rate. This simplifies conversion between a conventional signal and a high frame rate signal.
  • the frame rate is a multiple of several conventional frame/field rates, to improve interoperability with selected existing systems.
  • the rate can be chosen based on requirements, i.e. based on the conventional systems with which interoperability is desired.
  • a frame rate may be selected which is a multiple of both conventional television frame rates of 25 and 30, e.g. 150fps, or which is a multiple of both conventional television field rates of 50 and 60, e.g. 300fps.
  • a frame rate with a simple mathematical relationship with (e.g. being an exact multiple of) each of the television frame/field rates of 25fps/30fps and 50 and 60 fields-per-second and also the conventional film rate of 24fps is particularly preferred, such as 600fps or a multiple thereof, since this simplifies conversion between film sources as well as NTSC, PAL and HDTV systems.
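The 600fps figure above is simply the lowest common multiple of the conventional rates listed, which a short calculation confirms:

```python
from math import gcd

def lcm(a, b):
    """Lowest common multiple of two positive integers."""
    return a * b // gcd(a, b)

# Conventional film and television frame/field rates (per second).
rates = [24, 25, 30, 50, 60]
common = 1
for r in rates:
    common = lcm(common, r)
# `common` is the lowest frame rate that is an exact multiple of every
# rate above, so conversion to any of them needs only whole-frame
# dropping, repetition or averaging.
```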
  • the frame information for the high frame rate signal may be encoded in any suitable way, for example using conventional 8-bit or 10-bit per colour component encoding in RGB, YUV, YCrCb or any other suitable colour space.
  • Conventional compression techniques may also be used to reduce the data rate of the signal, including intraframe and interframe compression (e.g. using motion prediction).
  • embodiments of the invention thus provide an improved television, video and/or film system utilising higher frame/field rates than current systems.
  • the system comprises a camera and display, connected together via a link capable of capturing, conveying and displaying pictures at rates higher than current television, video and film rates. This can be attached to storage for the recording of these pictures or connected to a link for live broadcast of these pictures.
  • a link or broadcast system capable of sending these high frame/field rate pictures to other locations and a receiver.
  • a storage system capable of recording the signals from the camera and production system and playing them back for further manipulation or broadcasting.
  • embodiments of the invention may provide:
  • a system of volumetric coding for video compression which takes advantage of the improved temporal sampling, eliminating conventional motion compensation and using volumetric and directional transformations.
  • a system of noise reduction which takes advantage of the higher frame/field rate to improve the ability to differentially reduce noise in different parts of the picture
  • a system of noise reduction which takes advantage of the higher frame/field rate to improve the ability to reduce noise in moving objects by filtering along the direction of motion
  • Capturing video at very high frame rates in conventional artificial lighting conditions may be problematic in that light intensity may vary with the phase of 50Hz or 60Hz mains power sources. This could introduce marked illumination differences between frames and "beat notes" in the resulting video with a temporal frequency of only a few hertz, due to timing differences between the video and the light sources.
  • Embodiments may therefore also include:
  • a set of temporal-spatial filters to remove colour and level differences between frames in conventional artificial lighting
  • a further embodiment will now be described which provides a television system whereby the quality of the reproduced image is improved by the use of one bit sampling of the image at very high frame rates.
  • video or film pictures are captured at a bit-depth of one bit per colour component at a rate greatly in excess of the current 24/50/60 frames/fields per second, and are stored, transmitted and reproduced using the one bit representation.
  • the representation may use RGB, YUV, YCrCb or any other suitable colour space, with each component sampled with a one-bit bit depth.
  • the frame rate is selected to be sufficiently high so that the integration performed by the human eye renders the one-bit nature of each frame imperceptible.
  • the 1-bit system preferably uses a frame rate of at least around 10,000fps. Some embodiments use a frame rate between 100,000fps and 1,000,000fps, although still higher frame rates could also be used.
  • a high frame rate camera 40 captures a high frame rate 1-bit signal 42. Processing 44 may optionally be performed. The high frame rate 1-bit signal is then provided to a suitably adapted display 46. Alternatively, conversion stage 48 may be provided to convert the signal to a conventional low frame rate signal suitable for a standard display 50.
  • the system may include the various other elements discussed previously in relation to Figure 1.
  • This embodiment shares many advantages with the previously described embodiments, for example in relation to the realism of the images and the rendition of motion, in particular the reduction of motion blur and motion aliasing artefacts.
  • the use of one bit sampling and reproduction can also reduce and possibly eliminate other flicker and motion related effects such as photosensitive and pattern-sensitive epilepsy.
  • the system can be a good match to certain modern display technologies such as Plasma Display Panels or back- and front-projected systems based on Digital Mirror Devices, as only the driving electronics would need to be changed to make them one-bit video compatible.
  • an improved television, video and/or film system utilising bit-depths of one bit and much higher frame/field rates than current systems is provided.
  • the system includes a display capable of displaying video frames with a bit-depth of one bit (per colour component) at a frame rate much higher than that of conventional television and cinema: high enough that the integration performed by the human eye renders the one-bit nature of each frame imperceptible.
  • Plasma Display Panels and Digital Mirror Devices are examples of technologies that could be used to implement such a display.
  • instead of displaying a pixel with different brightness values for each of a number of colour components (typically RGB for digital displays), each colour component simply has two states: on or off.
  • the pixel component may be switched on and off rapidly to produce the effect of intermediate brightness values. This effect occurs since the human eye is unable to resolve the transitions at the very high frame rates used. Different switching frequencies of a colour component then correspond to different brightness values for the colour component.
  • the viewer in effect sees a full colour image (i.e. an image providing a greater colour range than a static single-bit-per-component image could actually provide), preferably substantially corresponding in colour range at least to a conventional television image.
  • the system may further include a television camera containing a sensor capable of capturing, digitising and outputting video frames with a bit-depth of one bit, at a rate equal to or greater than the display rate described above.
  • a conventional digital video camera that supports image capture at a sufficiently high frame rate could be used to acquire one-bit video merely by applying a threshold to each pixel value and making appropriate use of dithering or error feedback techniques.
  • an image sensor specifically adapted for 1-bit sampling may be used.
  • An unshuttered image capture device may be used.
  • capture of 1-bit video can use delta-sigma modulation as an analogue-to-digital converter (ADC), in which case the 1-bit video signal can be considered as pulse-density modulation (the level over a given time period is given by the number of pulses or non-zero values over that period).
  • each pixel (or pixel colour component) in the sensor accumulates charge in proportion to the light that has fallen on it, and may be considered as an integrator circuit.
  • This is connected with a threshold detector which determines whether the pixel charge is greater than a pre-determined threshold.
  • the detector outputs a pulse of size equal to the threshold if the charge is greater than the threshold, and this pulse forms the output signal.
  • the output of the threshold detector is subtracted from the pixel level, so that its level becomes the excess of its original level over the threshold, plus any further charge that has accumulated in the intervening time.
  • Figure 8 shows pixel accumulator 122 providing an accumulated pixel value to threshold detector 124. If the accumulated pixel value exceeds the threshold, a '1' is output (128) and the threshold value is subtracted by subtractor 120. Otherwise a '0' is output.
  • the input (126) is light level and the output (128) is a pulse train.
  • the threshold detector works on a clock (130) running at the 1-bit sample rate.
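The Figure 8 arrangement can be sketched as a first-order delta-sigma loop; threshold and light levels here are illustrative values, not taken from the specification:

```python
def one_bit_pixel(light_levels, threshold=1.0):
    """First-order delta-sigma model of the Figure 8 pixel: charge
    accumulates on each clock; whenever it exceeds the threshold a '1'
    is output (128) and the threshold is subtracted (subtractor 120)."""
    charge, pulses = 0.0, []
    for level in light_levels:
        charge += level               # integrator: light falling on the pixel
        if charge >= threshold:       # threshold detector (124)
            pulses.append(1)
            charge -= threshold       # feedback via the subtractor
        else:
            pulses.append(0)
    return pulses

# A steady light at a quarter of the threshold per clock yields a pulse
# density of one in four: pulse-density modulation of the light level.
train = one_bit_pixel([0.25] * 16)
```

Note that no charge is discarded between pulses, which is why an unshuttered 1-bit sensor retains information a conventionally shuttered camera would lose.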
  • FIG. 3 illustrates an example of a high-frame-rate signal 64 including a sequence of frames at a high rate.
  • a given pixel 66 of a frame's array of pixels comprises RGB colour components encoded using a single bit each.
  • a multi-bit sampled low frame rate signal 60 is also shown by way of comparison.
  • the system also includes a digital link or broadcast system with a bandwidth sufficient to carry the data being output by the camera and/or fed into the display in a compressed or uncompressed form.
  • the system may also incorporate the following:
  • a production system capable of editing, mixing and otherwise manipulating video in the one-bit format.
  • a storage system capable of recording the signals from the camera and production system and playing them back for further manipulation or broadcasting.
  • a conversion system such as a system based on the principles of delta-sigma modulation to enable conventional film and video pictures to be converted to the one-bit video format, and video in the one-bit format to be down converted to current film and TV frame/field rates. This is discussed in more detail below.
  • Conversion between one-bit and conventional video formats could be accomplished in a number of ways.
  • Existing conversion techniques for converting between PCM (pulse code modulation) and PDM (pulse density modulation) signals can be extended into the video domain.
  • the video signal can be put through a video standards-converter to increase its frame rate to that of the one-bit representation, and then the pixel signals (the signals corresponding to the values of each pixel as a function of time in the upconverted video stream) are put through a delta-sigma modulator.
  • Video captured in one-bit format at a higher frame or field rate is not necessarily equivalent to video captured with higher bit depth at a lower frame or field rate, since typical shuttering in cameras discards some of the information that would be captured in a one-bit system.
  • One bit video can allow for a fully flexible trade-off between bit depth/dynamic resolution and temporal and spatial resolution. It therefore allows very high spatial resolution video, such as Ultra High Definition Television (UHDTV), currently restricted to a maximum of 60 frames per second, to be displayed at a much higher frame rate but using less dynamic resolution. This restores a more appropriate balance between temporal and spatial resolution for these standards.
  • Conversion from one-bit to multi-bit conventional video effects an averaging of the video signal along a temporal axis.
  • full bit depth frames at different points in time may be constructed.
  • full bit depth frames at the higher frame or field rate can be obtained, to enhance playback especially for slow-motion effects.
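The one-bit to multi-bit conversion described above amounts to averaging the pulse-density stream over a temporal window; the function name and window sizes are illustrative:

```python
def pdm_to_pcm(bits, window):
    """Convert a one-bit pulse-density stream to multi-bit samples by
    averaging non-overlapping groups of `window` one-bit frames."""
    return [sum(bits[i:i + window]) / window
            for i in range(0, len(bits) - window + 1, window)]

# Sixteen one-bit frames with a pulse density of 1/4 average down to
# four multi-bit frames of level 0.25; a wider window trades temporal
# resolution for dynamic resolution.
levels = pdm_to_pcm([1, 0, 0, 0] * 4, window=4)
```

Sliding (overlapping) windows instead of non-overlapping ones would yield full bit depth frames at the higher rate, as contemplated above for slow-motion playback.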
  • One bit video with standard definition or high definition spatial dimensions, captured at 8 or 10 times the frame or field rate, need not increase the overall data rate relative to normal SD or HD signals.
  • Such a signal could be captured, stored and transmitted using standard video formats such as Serial Digital Interface (SDI) and the accompanying widely deployed SDI infrastructure. It could also, after conversion to multi-bit format, be compressed using conventional video coding methods, systems and standards.
  • the difference between one-bit and multi-bit video would usually only become apparent, in this case, upon display with a one-bit capable display.
  • a conventional display would display the video with reduced temporal resolution but increased dynamic resolution. If required, both temporal and spatial averaging could be applied in the conversion process.
  • the colour difference components Cb and Cr also have an extra bit initially; however, the scaling factors applied to Cb and Cr are used to reduce the dynamic range of the colour components from 9 or 11 bits back down to 8 or 10 bits.
  • In the 1-bit system we have the option of keeping the colour difference components as two bits, or scaling them to reduce them to 1 bit. For example, this can be done by treating the 2-bit signal as if it were an analogue signal itself, and choosing a threshold determined by a scaling factor to obtain a 1-bit signal. To summarise the above, a 1-bit capture format will provide 1-bit data for each of the captured primary colours.
  • Matrixing this will naturally produce a 1-bit signal for luma and 2-bit signals for other components.
  • the other components may also optionally be scaled by some means to produce a 1-bit signal.
  • the present invention and the various systems and methods described herein, preferably use either an RGB colour format using 1 bit for each of the R, G, B colour components, or a colour format using a luma component and multiple chroma components, where the luma component is encoded with 1 bit and the chroma components (colour difference components) are encoded with 1 bit or 2 bits each. Any of these formats are referred to herein by the term "one-bit video" or similar.
  • One-bit video is particularly suited to a novel method of chroma sub-sampling: in colour-space representations that separate chrominance and luminance (e.g. YCrCb), chroma sub-sampling can be implemented by sampling the chroma signals at a lower temporal rate, for example half the sampling rate used for the luminance (or luma) signal. This can be combined with conventional spatial chroma sub-sampling techniques to obtain a further advantage.
  • the above described temporal chroma sub-sampling technique may also be used with multi-bit sampled high frame rate signals.
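A minimal sketch of temporal chroma sub-sampling, with luma carried at the full frame rate and chroma at half rate; frames are modelled abstractly as list elements and the nearest-neighbour reconstruction is one illustrative choice among several:

```python
def subsample_chroma_temporally(y_frames, cb_frames, cr_frames, factor=2):
    """Keep every luma frame but only every `factor`-th chroma frame."""
    return y_frames, cb_frames[::factor], cr_frames[::factor]

def upsample_chroma(chroma_frames, factor=2):
    """Nearest-neighbour temporal reconstruction: repeat each chroma frame."""
    return [f for f in chroma_frames for _ in range(factor)]

# Eight frames of video: luma keeps full temporal resolution while the
# chroma components are carried at half the temporal rate.
y = list(range(8))
cb = list(range(8))
cr = list(range(8))
y_out, cb_sub, cr_sub = subsample_chroma_temporally(y, cb, cr)
```

Combining this with conventional spatial (e.g. 4:2:0) sub-sampling reduces the chroma data rate in all three dimensions.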
  • RGB->YCrCb may be implemented on a sample-by-sample basis using a probabilistic multiplication method, in which the weighted sum of the R, G and B samples is used as the probability that the output luminance will be 1, by comparing it to the output of a random number generator.
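The probabilistic matrixing above can be sketched as follows; the Rec. 709 luma weights are an illustrative assumption, since the specification does not name a particular weighting:

```python
import random

def one_bit_luma(r, g, b, rng=random.random):
    """Probabilistic matrixing of 1-bit R, G, B samples to a 1-bit luma
    sample: the weighted sum (here Rec. 709 weights, a value in [0, 1])
    is used as the probability that the output bit is 1."""
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    return 1 if rng() < y else 0

# Over many samples the output pulse density converges on the weighted
# sum, so the average luminance is preserved despite the 1-bit output.
random.seed(0)
density = sum(one_bit_luma(1, 0, 0) for _ in range(20000)) / 20000
```

At the very high frame rates of the 1-bit system, the eye's temporal integration averages out the per-sample randomness, just as it does the pulse-density modulation itself.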
  • a one-bit high frame rate video system may also be used in still-frame photographic applications.
  • the stream of video samples may be averaged in different ways to produce the effect of selecting different exposure times in the device. In this way the trade-off between aperture
  • Still cameras use sophisticated tracking technologies to perform image stabilisation.
  • the use of a one-bit high frame rate system provides multiple samples in time, which may be used to track motion more directly on the sensor. In this way image stabilisation can be performed more robustly using simple correlation techniques.
  • the invention may additionally provide systems, apparatus, methods and computer program products to implement:
  • HDTV and Digital Cinema systems require 1.5Gb/s or more for uncompressed data transfer. This is already substantial, and compression is widely used in television and even film production. HFR systems will require correspondingly higher data rates, equal to or greater than the 50Gb/s required by the emerging UHDTV standards. Compression is required in production and post-production, for transfer and storage of video, and for final delivery. Current production/post-production compression systems are intra-frame only, with compression ratios of at most 10:1 for high quality video. This is insufficient for HFR video. Conventional video compression systems also provide inter-frame compression by means of motion compensation, but such systems would be limited by the very large number of motion vectors required: a set of vectors for every frame of video.
  • Embodiments may therefore include HFR compression systems using alternative techniques based on 3-dimensional transforms, exploiting the very great similarity between frames (in general, the higher the frame rate, the more similar adjacent frames are likely to be).
  • An example is illustrated in Figure 6.
  • an HFR source 99 provides frames of video, which are buffered in a video frame buffer 100. Data is extracted from the frame buffer in three dimensional blocks of samples for each video component (Y, U or V, or R, G or B) by the sample blocker 101. Each block may then be transformed into the frequency domain and quantised by means of a 3-D transform and quantiser 103 and then passed to an entropy coder 104.
  • the transform used in 103 may be a wavelet transform, or a Discrete Cosine Transform or a Lapped Orthogonal transform or some other transform.
  • the blocks of samples may overlap, spatially or temporally, with other blocks and a window function (for example a raised cosine function) may be applied to the blocks prior to quantisation and coding.
  • the system may be enhanced by using prediction from previously coded data, for example from spatially neighbouring samples or samples from other frame-groups, either earlier or (if frames are re-ordered prior to coding) later in time.
  • prediction data is subtracted from 3D data to be coded by a subtractor 102.
  • an inverse 3D transform and quantisation block 105 reconstructs the coded coefficients, and the prediction initially subtracted from the block is added back in by an adder 106.
  • the reconstructed samples are then fed into a sample deblocker 107 and thence into a buffer of reconstructed video frames 108, from which further predictions may be made.
  • a 3D prediction estimator 109 produces a prediction vector for a block from data currently in the input buffer 100 and reconstructed data in the prediction buffer 108, and the 3D prediction generator 110 produces a prediction block from the previously coded samples and the prediction vector, which will be subtracted from the latest set of sample blocks.
  • A possible prediction method is illustrated in Figure 7, whereby reconstructed neighbouring samples in space and time are all available for use as a prediction.
  • a vector may extrapolate these values in any combination of vertical, horizontal and temporal directions to obtain a prediction block for the rectangular block of uncoded coefficients.
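The transform-and-quantise path of Figure 6 (blocks 101, 103 and 105) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the system as claimed: it uses an orthonormal separable 3-D DCT (one of the transforms the text permits), a uniform quantiser step, and omits windowing, prediction and entropy coding; all function names are illustrative.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix (rows are basis vectors)."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (m + 0.5) * k / n)
    c[0] /= np.sqrt(2.0)
    return c

def transform_3d(block):
    """Apply a separable 3-D DCT along the (time, y, x) axes of a block."""
    out = block.astype(float)
    for axis in range(3):
        c = dct_matrix(out.shape[axis])
        out = np.moveaxis(np.tensordot(c, out, axes=(1, axis)), 0, axis)
    return out

def inverse_3d(coeffs):
    """Inverse 3-D DCT (the matrix is orthonormal, so inverse = transpose)."""
    out = coeffs
    for axis in range(3):
        c = dct_matrix(out.shape[axis]).T
        out = np.moveaxis(np.tensordot(c, out, axes=(1, axis)), 0, axis)
    return out

def quantise(coeffs, step):
    """Uniform scalar quantiser, as in block 103."""
    return np.round(coeffs / step).astype(int)

# A 3-D block of 8 consecutive HFR frames, 8x8 pixels each (one component):
rng = np.random.default_rng(0)
block = rng.integers(0, 256, size=(8, 8, 8))
q = quantise(transform_3d(block), step=4.0)
recon = inverse_3d(q * 4.0)  # blocks 105/106 reconstruct the coded samples
print(np.max(np.abs(recon - block)))  # small: bounded by the quantiser step
```

Because adjacent HFR frames are very similar, most energy concentrates in the low temporal-frequency coefficients, which is what makes the subsequent entropy coding effective.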
  • the video signal is captured, processed and transmitted to the end user at the high frame rate.
  • the video signal may be converted to a lower frame rate signal prior to transmission, with the higher frame rate signal used for video processing/editing.
  • a lower frame rate can be achieved by combining captured frames. Additional motion information generated from the high frame rate source signal can be used to improve the compression of the signal.
  • one embodiment provides a television system that captures video at a substantially over-sampled frame rate but then down-samples to produce video at a conventional television frame/field rate (the target frame rate), such as 50Hz or 60Hz.
  • a high frame rate (HFR) signal is obtained from an HFR source 80 (at a source frame rate), such as from a camera or from storage.
  • a video effect (for example a shuttering effect or lighting effect as described in more detail below) is then applied to the signal by effects processor 82.
  • a downsampler 84 generates a video output signal 86 at a target frame rate from the effects-processed HFR signal. By processing effects at the high frame rate, visual quality can be improved (even if the signal is subsequently down-sampled).
  • the processed signal could also be output to the end-user in HFR format as described in relation to the system of Figure 1.
  • the effect of camera shuttering can be applied as a post processing step rather than being fixed at the time of capture.
  • This process can be considered as the application of a multi-tap filter in the temporal domain, equivalent to the temporal shape of the desired shutter.
  • the applied shuttering effect could be equivalent to a camera shutter of any duration, less than, equal to, or greater than the duration of a frame at the target frame rate, enabling effects to be achieved such as motion blur, smooth motion, or crisp jagged (highly shuttered) motion.
  • temporal aliasing effects, such as "wagon wheels", can be selectively eliminated or intentionally introduced.
  • the applied shuttering effect can be temporally shaped by using a non-rectangularly shaped temporal filter, i.e. not a form of direct average or accumulation, making possible shuttering effects not currently realisable with existing camera technology.
  • filters include exponential temporal filters; Gaussian temporal filters; sine function temporal filters; windowed sine function temporal filters; non-linear temporal filters such as median or rank order or morphological filters. Multiple filters may be combined.
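The multi-tap temporal filtering described above can be sketched as a weighted sum of HFR frames. This is an illustrative example under stated assumptions: a raised-cosine (Hann) temporal window as the shutter shape, a global (non-differential) shutter, and a 12:1 down-sampling ratio (e.g. 600fps to 50fps); all names are hypothetical.

```python
import numpy as np

def raised_cosine_taps(n):
    """Raised-cosine (Hann) temporal window, normalised to unit gain."""
    t = np.arange(n)
    w = 0.5 - 0.5 * np.cos(2 * np.pi * (t + 0.5) / n)
    return w / w.sum()

def apply_shutter(hfr_frames, taps):
    """Synthesise one target-rate frame as a weighted sum of HFR frames.

    hfr_frames: array (n_frames, height, width); len(taps) == n_frames.
    The tap values are the temporal shape of the simulated shutter."""
    return np.tensordot(taps, hfr_frames, axes=(0, 0))

# 12 HFR frames contribute to one target-rate frame:
frames = np.full((12, 4, 4), 100.0)
frames[6:] = 200.0  # a step change mid-exposure
out = apply_shutter(frames, raised_cosine_taps(12))
print(out[0, 0])  # ≈ 150: the symmetric window weights both halves equally
```

Swapping the tap vector for an exponential, Gaussian or windowed sinc shape realises the other filters listed above without changing the machinery.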
  • the applied shuttering effect can be applied differentially across a scene, allowing, for example, an object of interest to appear crisp whilst the remainder of the scene appears blurred even though both may contain similar rates of motion. In a simple example, this may be achieved by varying, across the image, the number of frames of the input signal that contribute to a given pixel of an output frame. However, in more complex examples, any of the temporal filtering methods described above may be applied differentially.
  • shuttering may be applied differentially depending on the type of motion. Shuttering may also be applied differently to different colour components.

Noise reduction

  • Temporal filtering can be applied differentially across a scene to vary the amount of noise reduction applied to different parts of the scene.
  • Such a temporal filter can also be generalised to a three dimensional convolution function to be applied to the video sequence, where the third dimension is temporal.
  • the convolution can be varied across the scene.
  • the convolution to be applied at any given point on any given moment could be shaped to integrate along the path of motion present at that location.
  • the effect of such a convolution is to apply filtering, such as noise reduction, along the path of motion.
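A much-simplified sketch of filtering along the path of motion follows, assuming a single global, integer-pixel velocity; the text allows the convolution to vary across the scene and to follow arbitrary local motion paths, which this illustration does not attempt.

```python
import numpy as np

def filter_along_motion(frames, vx, vy):
    """Average HFR frames after motion-compensating each back to the time
    of the middle frame, assuming one global velocity of (vx, vy) pixels
    per frame. Noise is averaged down while moving content stays sharp."""
    n = len(frames)
    mid = n // 2
    acc = np.zeros_like(frames[0], dtype=float)
    for t, f in enumerate(frames):
        dt = t - mid
        # shift frame t by -dt*v so its content lines up with the middle frame
        acc += np.roll(f, shift=(-dt * vy, -dt * vx), axis=(0, 1))
    return acc / n

# A bright dot moving 1 px/frame to the right stays sharp after filtering:
frames = np.zeros((5, 8, 8))
for t in range(5):
    frames[t, 4, 1 + t] = 1.0
out = filter_along_motion(frames, vx=1, vy=0)
print(out[4, 3])  # → 1.0: the dot survives at its mid-frame position
```

A plain (non-compensated) temporal average of the same input would smear the dot to 0.2 across five positions, which is exactly the motion blur this technique avoids.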
  • a frequent problem with television systems where the frame rate differs from that of the mains electricity is that the brightness of the lighting varies between one frame and the next. If the frame rate is close to, but different from, the mains frequency, this may cause a beat between the two, causing a noticeable brightness flicker on the picture.
  • this flicker may be present at too high a frequency to be noticeable to the human eye, yet it may still reduce the efficiency of a bit-rate reduction system. The removal of this effect is therefore desirable, and the following describes how it may be achieved.
  • this technique can be used to ensure that there is no flicker when video material shot under lighting at one mains frequency (for example, 50Hz) is converted for transmission or display at another rate (for example, 60Hz).
  • the lighting level may be determined by calculating the mean luma level over certain parts or all of the image. By means of a suitable band-pass filter on a signal representing the varying mean level (that is, a part of the image which may be consistent in colour or hue but varying in luma or luminance level), a correction signal may be derived which, when subtracted from all the luma values in that part or the whole image, will remove the periodic variation in luma caused by the fluctuations in illumination.
  • the low-pass element of the band-pass filter will remove changes in the real scene.
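A minimal sketch of the mean-luma correction described above: the per-frame mean luma is computed, a moving average (the low-pass element) estimates the true scene level, and the residual is subtracted from every pixel. The 300fps capture rate, 100Hz ripple and filter length are illustrative assumptions, and the high-pass element for noise rejection is omitted.

```python
import numpy as np

def remove_flicker(frames, smooth=25):
    """Remove periodic global brightness flicker from luma frames.

    The moving average of the per-frame mean luma estimates the real
    scene level; the residual is the flicker correction signal."""
    means = frames.mean(axis=(1, 2))
    kernel = np.ones(smooth) / smooth
    baseline = np.convolve(means, kernel, mode='same')  # low-pass element
    correction = means - baseline
    return frames - correction[:, None, None]

# 300fps capture under 50Hz mains: a 100Hz brightness ripple on the signal
t = np.arange(300)
flicker = 5.0 * np.sin(2 * np.pi * 100 * t / 300)
frames = np.full((300, 4, 4), 128.0) + flicker[:, None, None]
clean = remove_flicker(frames)
# residual ripple (away from the filter edges) is far below the 5-unit input
print(np.std(clean[20:-20].mean(axis=(1, 2))))
```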
  • the high-pass element will remove noise or other unwanted components from the signal. For example, this can be used to remove lighting variations from multiple sources, even if they are not in phase with each other or are not matched in frequency.

Identifying light sources
  • the high speed continuous flicker of some sources of illumination is not normally perceivable by the human eye and brain or by conventional television systems.
  • at a sufficiently high over-sampling frame rate (such as a frame rate that is twice as fast as the flicker rate, or faster), this flicker can be detected by a television system and used to distinguish elements of the image illuminated by such a light source.
  • Several sources can potentially be distinguished if their characteristic flicker varies with respect to the other sources in phase and/or frequency.
  • Filtering can also be applied to separate out the components of a scene's brightness due to the flicker of a particular illumination source, and then subtract that from the scene, or add it back in whilst also multiplying it by an arbitrary scalar.
  • Light sources can be deliberately offset in phase to enable them to be separated more easily by filtering. In the final temporally down-sampled video, this will create the effect of selectively removing a particular light source from the scene or varying its apparent brightness. This could be used, for example, to separate parts of the image illuminated by daylight, not flickering, from parts of the image illuminated with a light which may or may not be of a different colour temperature, but which flickers. A difference in colour temperature may also assist in this segmentation. This could, for example, be used to correct differences in white balance or colour shading within an image.
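One way to sketch this separation is a pixel-wise temporal Fourier filter that isolates a known flicker frequency and re-inserts it scaled by an arbitrary gain (0 removes the flickering source's modulation, values above 1 exaggerate it). The frame rate, frequencies and scene layout below are illustrative assumptions, not values from the text.

```python
import numpy as np

def scale_flicker_component(frames, fps, flicker_hz, gain):
    """Scale the temporal component of a scene at a known flicker frequency.

    frames: (n_frames, height, width) luma sequence sampled at `fps`.
    Each pixel's time series is transformed, the bin nearest `flicker_hz`
    is scaled by `gain`, and the sequence is reconstructed."""
    spec = np.fft.rfft(frames, axis=0)
    freqs = np.fft.rfftfreq(frames.shape[0], d=1.0 / fps)
    bin_idx = np.argmin(np.abs(freqs - flicker_hz))
    spec[bin_idx] *= gain
    return np.fft.irfft(spec, n=frames.shape[0], axis=0)

# Left half lit by steady daylight, right half by a 100Hz-flickering lamp:
fps, n = 600, 600
t = np.arange(n)
frames = np.full((n, 4, 8), 100.0)
frames[:, :, 4:] += 20.0 * np.sin(2 * np.pi * 100 * t / fps)[:, None, None]
steady = scale_flicker_component(frames, fps, flicker_hz=100, gain=0.0)
print(steady[:, 0, 6].std())  # ≈ 0: the lamp-lit side no longer flickers
```

Sources deliberately offset in phase or frequency, as suggested above, would occupy distinct bins (or distinct phases within a bin) and can be targeted separately.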
  • Instantaneous high speed flicker, such as the flash from flash photography, can also be detected and corrected.
  • the frame or frames, or parts of the image, illuminated by the flash can be detected by the fact that they have a higher than average luma, or that the colours are desaturated, or that the signal exhibits a degree of white clipping, or some combination of these.
  • the flash may either be partially or wholly removed by means of a median or other filter in time, or by reducing the luma level to that of the surrounding frames, or by completely removing the affected frame or frames or parts of the image and replacing them by interpolation from the surrounding frames.
  • the frame rate should preferably be sufficiently high so that no significant motion occurs between a flash-illuminated frame and neighbouring frames, so that the flash-illuminated frame may be discarded or interpolated.
  • a rate of 300fps should be sufficient for SD and HD video, but in some cases the rate may need to be higher for higher spatial resolutions.
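A sketch of flash detection and removal by neighbour interpolation follows. The outlier test here uses mean luma against a median/median-absolute-deviation statistic, which is one illustrative choice among the detection cues the text lists (high luma, desaturation, white clipping); the threshold is likewise an assumption.

```python
import numpy as np

def remove_flash_frames(frames, threshold=3.0):
    """Detect flash-lit frames as mean-luma outliers and replace each by
    the average of its temporal neighbours. Interpolation is only valid
    if the frame rate is high enough that little motion occurs between
    neighbouring frames, as noted in the text."""
    frames = frames.astype(float).copy()
    means = frames.mean(axis=(1, 2))
    med = np.median(means)
    mad = np.median(np.abs(means - med)) + 1e-9  # avoid division by zero
    for i in np.where((means - med) / mad > threshold)[0]:
        lo, hi = max(i - 1, 0), min(i + 1, len(frames) - 1)
        frames[i] = 0.5 * (frames[lo] + frames[hi])
    return frames

clip = np.full((10, 4, 4), 100.0)
clip[5] = 250.0  # a photographic flash saturates one frame
fixed = remove_flash_frames(clip)
print(fixed[5, 0, 0])  # → 100.0: the flash frame is rebuilt from neighbours
```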
  • An HFR signal is obtained from an HFR source 90.
  • Motion vector information is obtained by motion analysis 94.
  • the HFR signal is also temporally downsampled by down-sampler 92 to produce a sequence of images at the target frame rate.
  • the sequence of images at the target frame rate is compressed by compressor 96, using the motion vector information derived from the HFR signal to perform interframe compression, to produce compressed output 98.
  • Any suitable known compression algorithm may be used, e.g. an MPEG-based algorithm.
  • motion information derived from the HFR signal may also be transmitted together with the low frame rate signal to a receiver (e.g. set-top box).
  • the receiver can then generate an output signal at a frame rate which is higher than the transmission frame rate using the motion information (e.g. by interpolation).
  • the receiver can generate additional frames using the motion information.
  • the output signal is then displayed e.g. on a television.
  • a receiver not adapted or configured to provide a higher frame rate signal may simply ignore the motion information and just output the low frame rate signal as received, thus maintaining compatibility with standard equipment.
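The receiver-side upconversion described above might be sketched as motion-compensated interpolation between two transmitted frames. This is a simplification under stated assumptions: a single global integer-pixel motion vector per frame interval, where a real system would transmit per-block vectors derived from the HFR source.

```python
import numpy as np

def interpolate_frame(prev, nxt, vx, vy, alpha=0.5):
    """Synthesise a frame a fraction `alpha` of the way from `prev` to
    `nxt`, using a global motion vector (vx, vy) in pixels per frame
    interval, as might be transmitted alongside a low frame rate signal."""
    fwd = np.roll(prev, shift=(round(alpha * vy), round(alpha * vx)),
                  axis=(0, 1))
    bwd = np.roll(nxt, shift=(round(-(1 - alpha) * vy),
                              round(-(1 - alpha) * vx)), axis=(0, 1))
    return (1 - alpha) * fwd + alpha * bwd

# A dot moves 2 px right between two 50fps frames; the receiver doubles the rate:
prev = np.zeros((8, 8)); prev[4, 2] = 1.0
nxt = np.zeros((8, 8)); nxt[4, 4] = 1.0
mid = interpolate_frame(prev, nxt, vx=2, vy=0)
print(mid[4, 3])  # → 1.0: the interpolated dot sits at the halfway position
```

Without the motion vector, a plain blend of the two frames would show the dot half-bright in both positions (a "double image"), which is why transmitting HFR-derived motion information improves the interpolated result.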
  • Embodiments of the invention thus provide an improved conventional television system incorporating a camera that captures at a frame rate substantially higher than that used by a conventional television system, together with various production apparatuses that manipulate or utilise this high frame rate video, but which eventually down-sample temporally back to a conventional television frame rate.
  • Such production apparatuses may, for example, include:
  • An apparatus that temporally down-samples back to the frame rate of conventional television and applies a video compression scheme that includes inter-frame motion based compression techniques, where the motion vectors required for such a compression scheme are computed from the temporally over-sampled video.
  • although the effects described above are presented in the context of a system which performs downsampling after the effects have been applied, they can also be implemented in a system in which video is provided to the end user at the high frame rate at which it is captured and processed, as previously described.
  • embodiments of the invention provide television/film capture, production and/or transmission systems operating at a frame rate which is substantially higher than existing conventional television systems.
  • Video processing and editing, and in particular effects processing may be performed at the substantially higher frame rate.
  • the video format used by the system may use a bit depth of one bit.
  • the high frame rate video may be down-converted to standard frame rates for transmission, or may be transmitted at the high frame rate to suitably modified or specially designed end-user equipment.
  • Motion information determined for a high frame-rate signal can be used to improve compression of a corresponding lower frame rate signal or to enable reproduction by a receiver of a higher frame rate signal from a lower frame rate signal.
  • Embodiments of the invention can provide the following advantages:-

Abstract

Television/film capture, production and/or transmission systems are disclosed operating at a frame rate which is substantially higher than existing conventional television systems. Video processing and editing, and in particular effects processing, may be performed at the substantially higher frame rate. The video format used by the system may use a bit depth of one bit. The high frame rate video may be down-converted to standard frame rates for transmission, or may be transmitted at the high frame rate to suitably modified or specially designed end-user equipment. Motion information determined for a high frame-rate signal can be used to improve compression of a corresponding lower frame rate signal or to enable reproduction by a receiver of a higher frame rate signal from a lower frame rate signal.

Description

Television system
The present invention relates in particular examples to methods and systems for providing and processing video and television signals. Recently, significant efforts have been made to provide video and television products and services with improved visual quality, for example by way of high-definition televisions, set-top-boxes and broadcasts, and high- definition video media such as HD-DVD and Blu-ray. However, these new products and services are often still based to a large extent on old video formats that give rise to certain quality limitations, for example in relation to the reproduction of fast motion. Motion blur and motion aliasing artefacts are examples of problems that can arise in these systems.
The present invention seeks to alleviate certain problems and limitations of prior art television and video processing systems. Accordingly, in a first aspect of the invention, there is provided a method of providing a digital video signal, comprising encoding the signal with a colour depth of at most two bits per pixel colour component. Preferably, the method comprises encoding the signal with a colour depth of one bit per colour component. The signal preferably has a frame rate substantially higher than conventional video, television or film frame rates.
In this way, a full colour image can be produced from the signal, either by conversion, or by displaying at the high frame rate. In particular, the signal preferably has a high frame rate selected such that, when displayed at the high frame rate, the effective colour depth as perceived by a viewer exceeds the actual encoded colour depth. In this way, increased temporal resolution can be traded against reduced colour resolution, without loss of overall colour information (or at least without substantial loss of overall colour information), because the colour information remains in the signal but is in effect temporally encoded, effectively as frequency information of 1-bit or 2-bit colour depth colour components. Furthermore, due to the high frame rate, the quality of the video signal can be improved with regard to certain characteristics, in particular in relation to representation of motion, as will be explained later.
In a further aspect of the invention, there is provided a method of providing a digital video signal, comprising encoding the signal with a first colour depth at a high frame rate, the high frame rate selected such that, when displayed at the high frame rate, the effective colour depth as perceived by a viewer exceeds the first colour depth with which the signal was encoded.
In either of the above aspects, the effective colour depth as perceived by a viewer when the signal is displayed at the high frame rate preferably corresponds at least to the colour depth of an image encoded with 4 bits per colour component, preferably 5 bits per colour component, more preferably 8 bits per colour component. In other words, the temporally encoded colour information enables presentation or regeneration of an image from the low colour depth signal corresponding to a conventional colour encoding. The frame rate is preferably sufficiently high such that colour information can be recovered from the signal to produce an equivalent of an image encoded with at least 4 bits per colour component, preferably at least 5 bits per colour component, more preferably at least 8 bits per colour component. Preferably, the signal has a frame rate which is sufficiently high so that display of the signal at the high frame rate produces the effect of a full colour image on a viewer.
Advantageously, the signal has a frame rate of at least 5,000 fps, preferably at least 10,000fps, more preferably at least 20,000 fps. In certain embodiments, the signal preferably has a frame rate of at least 50,000fps, more preferably at least 100,000fps.
The method may comprise encoding the video signal using a given conventional video format having an associated conventional frame rate, using a frame rate higher than the conventional frame rate, the high frame rate selected such that the data rate of the encoded signal at the high frame rate does not exceed the data rate of a conventional video signal encoded using the conventional video format at the conventional frame rate. The method may comprise selecting the frame rate to enable transmission via a conventional video interface or channel, preferably an SDI (Serial Digital Interface) interface or channel. The high frame rate may be one of 8, 10, 256 or 1024 times the selected conventional television or film frame rate.
The signal is preferably encoded using one bit for any primary colour component or luma component and at most two bits for any chroma component. The signal may be encoded using, for each pixel, a plurality of colour components corresponding to primary colours, each colour component encoded using (exactly) one bit. For example, the signal may be encoded in RGB colour space using one bit for each of the red, green and blue components. Alternatively, the signal may be encoded in a format having a luma component and a plurality of (for example two) chroma (or colour difference) components, the encoding using one bit for the luma component and at most two bits for each chroma component. For example chroma components may each be encoded with sign and magnitude bits. Alternatively, the luma and chroma components may each be encoded with (exactly) one bit.
Preferably, the method comprises converting the signal to a second signal having a second colour depth greater than that of the signal. This may comprise performing spatial and/or temporal averaging of pixel values to obtain pixel values at the second colour depth. The method may also comprise compressing the converted video signal (for example using a conventional compression algorithm). The term "pixel value" as used herein encompasses individual colour component values making up a pixel.
The method may comprise processing the video signal to perform image stabilisation based on correlation or motion estimation between successive frames.
In a further aspect of the invention, there is provided a method of processing a video signal, comprising: receiving an input video signal defining a sequence of temporal image units (e.g. frames or fields) encoded with a first colour depth (preferably one or two bits per colour component as set out above); and generating an image unit (e.g. frame or field) encoded with a second colour depth greater than the first colour depth from image units (frames or fields) of the sequence, the generating comprising: aggregating source pixel values for a plurality of frames or fields in the sequence to produce output pixel values for the generated frame or field. The plurality of frames or fields (or other image units) are preferably adjacent frames or fields in the sequence, but may alternatively be non-adjacent frames or fields. The image units are preferably frames. The input video signal is preferably at a comparatively high frame rate as set out above. The output pixel values preferably correspond to an average of the source pixel values quantized to the second colour depth. The pixel values may, for example, comprise separate colour components, with aggregation performed for each colour component. The method may comprise generating a sequence of frames or fields with the second colour depth, each frame or field generated from a respective group of adjacent frames or fields of the input video signal.
The respective groups of adjacent frames or fields may or may not overlap. Preferably, each group of adjacent frames defines a window of a given number of frames or fields selected in relation to a given time instant, with an output frame or field generated for that time instant from the frames in the window. The windows for successive output frames can overlap, in which case an individual source frame may contribute to multiple output frames. The window progresses along the signal's time axis as successive output frames are generated. This allows, for example, a sequence of full colour depth frames to be generated at the same frame rate as the original low colour depth video signal. Alternatively, the windows (groups) may be discrete, in which case the output signal may have a reduced frame rate (e.g. a conventional frame rate). Preferably the method comprises, for each input frame or field of the input video signal: generating an output frame or field in the output signal based on a group of frames or fields in the input signal including said input frame or field. Preferably, the aggregating is performed on one or more groups of adjacent frames or fields, and the number of frames or fields in the or each group is preferably selected in accordance with a variable parameter. In this way, a variable artificial shuttering effect can be applied to the signal as explained in more detail later. The input video signal in this aspect is preferably a video signal as provided by any method set out above.
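The windowed aggregation described in this aspect can be sketched as a temporal average of one-bit frames quantised to a higher colour depth. The window length, duty-cycle model and output depth below are illustrative assumptions, not parameters fixed by the text.

```python
import numpy as np

def aggregate_window(bit_frames, out_depth=8):
    """Aggregate a window of 1-bit frames into one frame at a higher
    colour depth: the temporal duty cycle of each pixel is averaged and
    quantised to `out_depth` bits."""
    levels = (1 << out_depth) - 1
    mean = bit_frames.mean(axis=0)  # in [0, 1]
    return np.round(mean * levels).astype(np.uint16)

# 255 one-bit frames whose duty cycle temporally encodes a grey of ~0.6:
rng = np.random.default_rng(1)
window = (rng.random((255, 4, 4)) < 0.6).astype(np.uint8)
frame8 = aggregate_window(window)
print(frame8.mean())  # close to 0.6 * 255 ≈ 153
```

Sliding the window one input frame at a time yields overlapping groups (full output rate); stepping it by the window length yields discrete groups and a reduced, conventional output rate, matching the two alternatives described above.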
The above methods and features may, for example, be implemented in an image capturing device such as a digital still or video camera.
Accordingly, in a further aspect, the invention provides an image capturing device comprising: image sensor means for outputting a sequence of image frames with a first colour depth of one bit per colour component; and means for generating an output image with a colour depth greater than the first colour depth from a group of (preferably adjacent) frames of the sequence, the generating means adapted to aggregate source pixel values for the group of frames to generate output pixel values for the output image.
Preferably, the image capturing device comprises means for receiving a parameter defining a virtual shutter speed; wherein the group of frames used to generate the output image is selected to extend over a given time period in accordance with the received parameter. The image sensor means is preferably adapted to output the sequence of image frames at a given frame rate, and the generating means is preferably adapted to select a number of adjacent frames for use in generating the output image in accordance with the received parameter. In this way, an artificial shuttering effect can be implemented, in which the parameter defines the shutter duration.
The device may be used to acquire a still image. Alternatively, the generating means may be adapted to generate a video signal comprising a plurality of output image frames with a colour depth greater than the first colour depth.
The device may further comprise means for analysing image frames of the sequence of image frames to track motion and/or perform image stabilisation, wherein image stabilisation is preferably performed on the generated image.
The invention also provides apparatus or a television system having means for performing any method as set out herein or having means for processing, storing or transmitting a digital video signal as provided by any method as set out herein.
In a further aspect of the invention, there is provided a television system adapted to provide, process or use a television signal having a frame rate substantially higher than conventional television and/or film frame rates. Preferably, the system comprises means for transmitting the signal over a transmission medium to a plurality of television receiver devices. The signal may be transmitted at the high frame rate or as a converted signal having a lower frame rate which is derived from the high frame rate signal as is explained in more detail later. The signal may be encoded with a bit depth of at most two bits per colour component, preferably one bit per colour component, as set out above.
By using frame rates substantially higher than conventional television frame or field rates, video image quality can be improved, in particular in relation to the portrayal of fast motion and motion artefacts.
Specifically, the frame rate is preferably substantially higher than one or more (preferably each) of: PAL frame or field rate, NTSC frame or field rate, and standard film frame rate. Such frame rates (ranging from 24fps for film to 30fps for NTSC) have been in use widely for decades (indeed since the very beginning of film and television) and have generally been felt to be adequate since they allow for the reproduction of apparently smooth moving images. This, together with the high data rates required for encoded video, has led to a prejudice in the art for choosing such relatively low frame rates. However, underlying the present invention is the realisation that significant quality gains can be obtained by using much higher frame rates, without necessarily requiring a proportional increase in data rates.
Advantageously, to achieve some of these improvements, the frame rate is at least 80fps, preferably at least 100fps, more preferably at least 150fps. Even higher frame rates may lead to correspondingly greater improvements in video quality. Accordingly, the frame rate is advantageously at least 300fps, preferably at least 600fps, more preferably at least 1200fps. Still higher frame rates may be used, for example at least 2500fps or at least 5000fps. Some embodiments use frame rates of at least 10,000fps or even at least 100,000fps. Examples of such embodiments are described below in connection with a one-bit colour component encoding.
Preferably, the frame rate is a multiple of one or more selected conventional television frame or field rates or film rates. The frame rate may be a multiple of each of a plurality of selected conventional television frame or field rates or film rates. This enables easier conversion between the relevant conventional rates and the high frame rate.
Specifically, the frame rate may be a multiple of one or more, preferably each, of: 25fps PAL frame rate, 30fps approximate NTSC rate and 24fps standard film rate. Preferably, the frame rate is a multiple of one or more, preferably each of: 50 fields-per-second PAL field rate, 60 fields-per- second approximate NTSC field rate, and 24fps standard film rate.
Preferably, to allow easier conversion to and from all of the most widely used conventional formats, the frame rate is a multiple of each of the above (24fps, 25fps, 30fps), and thus is preferably 600fps or a multiple thereof.
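The 600fps figure follows directly from the least common multiple of the three conventional rates, so each conventional format maps to a whole number of high-rate frames per output frame:

```python
from math import lcm

# 600fps is the lowest rate that down-converts to film and both TV rates
# by combining a whole number of frames per output frame:
rate = lcm(24, 25, 30)
print(rate)                                # → 600
print(rate // 24, rate // 25, rate // 30)  # → 25 24 20
```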
The term television signal preferably refers to a video signal which can be broadcast on a television network, but may also include such a signal as recorded by a camera and/or processed in a production/editing system prior to broadcast. The signal preferably defines a sequence of images displayable at the frame rate. In other words, a sequence of images (or frames) for display at the frame rate and at a given display resolution are encoded in the signal (though the encoding may use compression including, for example, frame prediction, so that each frame need not necessarily be encoded pixel-by- pixel). The signal is preferably a digital television signal, more preferably a digital television broadcast signal for broadcast on a digital television network.
Different colour components of the signal may be sampled at different temporal rates. For example, to reduce the data rate, the method may comprise sampling one or more chroma components at a lower temporal rate than a luma (or luminance) component, preferably at half the luma sampling rate or less, more preferably at a quarter the luma sampling rate or less. Spatial chroma sub-sampling may be used in addition to temporal chroma sub-sampling.
The signal may be encoded with a bit depth of at most two bits per colour component, preferably with one bit per colour component or with one bit plus a sign bit (depending on the colour space used). A frame rate of at least 10,000fps may be used in this case. Given a sufficiently high frame rate, this can enable efficient encoding and processing of the signal.
The system preferably comprises means for providing source video at the high frame rate, the providing means preferably including one or more cameras operable to capture source video at the high frame rate. Other high frame rate sources may be used, for example a high frame rate animation system. The source video may form the television signal or may be incorporated into the television signal (for example by editing to combine with other sources).
The system may also comprise means for performing video editing or video processing at the high frame rate (on the high frame rate signal or to produce the high frame rate signal). Preferably, the system comprises means for performing video effects processing on the high frame rate signal to add a video effect to the high frame rate signal, preferably a shuttering effect or a lighting effect. By using a high frame rate, certain new video processing effects can be implemented. The system preferably comprises means for transmitting the high frame rate signal to end user equipment, the end user equipment preferably adapted to output the signal at the high frame rate. The system may include the end user equipment, which may comprise one or more televisions, set top boxes, cinema display systems, personal computers, mobile devices, or the like. The transmitting means preferably includes a broadcast system for broadcasting the television signal on a broadcast network, such as a digital terrestrial, cable or satellite network.
The television system may thus include one or more video cameras or other sources, video production/editing system(s), transmission system(s) and end user equipment all operating at the high frame rate, thus providing a complete high frame rate television production, distribution and display system.
The system may comprise means for compressing the signal, preferably using three-dimensional block-based coding, preferably using temporal and/or spatial prediction of three-dimensional blocks of video data.
The system may comprise means for converting the television signal at the high frame rate to a low frame rate signal, preferably at a standard television frame rate or film rate, in which case the system may further provide means for transmitting the low frame rate signal to end user equipment. This can enable interoperability with conventional end user equipment which is not compatible with the higher frame rate.
The system may comprise means for deriving information for use in compression from the high frame rate signal, and means for compressing the low frame rate signal using the derived information. Specifically, the system may comprise means for deriving motion information from the high frame rate signal and means for compressing the low frame rate signal using the derived motion information. This can improve compression efficiency and quality, allowing the low frame rate signal to benefit from the additional information available in the high frame rate source signal.
The system may alternatively or additionally comprise means for deriving motion information from the high frame rate signal, and means for transmitting the motion information with the low frame rate signal to a receiver. The receiver (e.g. end user equipment) may be adapted, using the received motion information, to output a signal at a frame rate higher than the low frame rate at which the signal was transmitted, preferably at the original high frame rate. This is another way in which reproduction quality of the low frame rate signal may be improved using information derived from the high quality signal.

In further aspects corresponding to the above, the invention provides apparatus having means for processing a television signal having a frame rate substantially higher than conventional television or film frame rates, and a method of providing a television signal having a frame rate substantially higher than conventional television or film frame rates, both optionally with corresponding preferred features. The apparatus may be a digital television receiver or set-top box, or a display, for displaying the signal at the high frame rate, a camera for capturing the signal at the high frame rate, or a video processing/editing apparatus for processing/editing video at the high frame rate. The invention also provides a corresponding video editing system comprising means for editing video at a frame rate substantially higher than conventional television or film frame rates, again optionally with corresponding preferred features.
Some further aspects and features relating to the low colour depth encoding discussed above will now be set out. In one aspect of the invention, there is provided a method of providing a digital video signal, comprising encoding the signal with a bit depth of at most two bits per colour component. As previously mentioned, the encoding may use one bit plus a sign bit for one, some or all colour components and/or may use a single bit for one, some or all colour components. In the case of monochrome video, there may be only a single colour component for each pixel, whilst a colour image will have multiple (typically three) colour components per pixel.
Preferred embodiments use a single bit to encode each colour component. The video signal is preferably colour video. In this way, a simple encoding scheme can be provided, which can enable more efficient processing of the signal. The signal preferably has a frame rate substantially higher than conventional television or film frame rates. Advantageously, the signal has a frame rate which is sufficiently high so that display of the signal at the high frame rate produces the effect of a full colour image on a viewer (due to the integration performed by the eye, the one-bit nature of each frame becomes substantially imperceptible, i.e. the viewer sees colours that are not actually present at any given instant in time because the pixels are switched too rapidly to be perceived separately). The term "full colour image" preferably means an image having a colour range corresponding at least to conventional standard or high definition television or video formats.
Preferably, the signal has a frame rate of at least 10,000fps, more preferably at least 100,000fps. In some cases, frame rates of at least 250,000fps, at least 500,000fps or even at least 1,000,000fps may be used. By using an extremely high frame rate of, for example, at least 10,000fps, the effect of a colour image can be created despite the 1-bit sampling used.
The method may further comprise converting between the one-bit (or one-bit plus sign) video format and a video format using multiple (more) bits per colour component, preferably a standard television or film format. The invention also correspondingly provides a camera adapted to record a digital video signal with a bit depth of at most two bits, preferably one bit per colour component (or one-bit plus sign as described above), and a television receiver or display adapted to receive, decode, output and/or display a digital video signal having a bit depth of at most two bits, preferably one bit (or one-bit plus sign) per colour component. These aspects may include corresponding preferred features.
Similarly, the invention provides a video editing system comprising means for editing video encoded with a bit depth of at most two bits, preferably one bit per colour component (or one-bit plus sign), preferably at a frame rate substantially higher than conventional television or film frame rates, preferably at least 10,000fps, more preferably at least 100,000fps.
The invention also provides a video conversion system comprising means for converting between a first video format having a high frame rate and a bit depth of at most two bits, preferably one bit per colour component (or one bit plus sign) and a second video format having a lower frame rate and a bit depth of multiple (or more) bits per colour component. The second video format is preferably a standard television or film format, preferably a PAL or NTSC or HDTV television format.

In a further aspect, the invention provides a method of providing a video signal, comprising sampling at least one colour component of the signal at a different temporal rate to one or more other colour components. Preferably, one or more chroma components are sampled at a lower temporal rate than a luma (or luminance) component, preferably at half the luma sampling rate or less, more preferably at a quarter the luma sampling rate or less. Spatial chroma sub-sampling may be used in addition to temporal chroma sub-sampling. The signal preferably has a frame rate substantially higher than conventional rates as described above and/or may be encoded with two, preferably one, bits per colour component as described above.

In a further aspect, the invention provides a method of providing a compressed video signal, comprising: receiving a video signal at a first frame rate; deriving information for use in compression from the video signal at the first rate; converting the video signal at the first frame rate to a video signal at a second frame rate different from the first frame rate; and compressing the video signal at the second frame rate using the derived information. The second frame rate is preferably lower than the first frame rate. In this way, compression quality and efficiency for the lower frame rate signal can be improved using information from the higher frame rate signal. The information is preferably motion prediction or motion vector information (e.g. one or more motion vectors) for use in interframe compression of the video signal. The video signal may be encoded using a bit depth of one bit or another encoding described herein.
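By way of a minimal sketch (not part of the original disclosure), motion information of the kind referred to above could be derived from adjacent high frame rate frames by exhaustive block matching; the function name, block size and search range below are illustrative assumptions:

```python
import numpy as np

def motion_vector(prev, curr, by, bx, block=8, search=4):
    """Exhaustive block matching: find the (dy, dx) shift of the block at
    (by, bx) in `prev` that best matches `curr`, minimising the sum of
    absolute differences (SAD)."""
    ref = prev[by:by + block, bx:bx + block]
    best, best_v = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = by + dy, bx + dx
            if y < 0 or x < 0 or y + block > curr.shape[0] or x + block > curr.shape[1]:
                continue  # candidate window falls outside the frame
            sad = np.abs(curr[y:y + block, x:x + block] - ref).sum()
            if best is None or sad < best:
                best, best_v = sad, (dy, dx)
    return best_v

# Synthetic test: the next frame is the previous one shifted by (2, 1).
prev = np.zeros((32, 32))
prev[8:16, 8:16] = 1.0
curr = np.roll(np.roll(prev, 2, axis=0), 1, axis=1)
print(motion_vector(prev, curr, 8, 8))  # → (2, 1)
```

Because adjacent frames at a high frame rate differ only slightly, the search range can be kept small, which is one reason deriving motion at the high rate can be cheaper and more reliable than deriving it at the lower transmitted rate.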
In a further aspect, the invention provides a method of transmitting a video signal, comprising: receiving a video signal at a high frame rate; deriving motion information from the high frame rate video signal; converting the high frame rate video signal to a low frame rate video signal; transmitting the low frame rate video signal and the motion information to a receiver; and at the receiver, deriving from the received low frame rate signal a signal at an output frame rate higher than the low frame rate at which the signal was transmitted using the received motion information, and outputting the derived signal at the output frame rate. In this way, display quality may be improved for a lower frame rate signal. The output frame rate may be equal to the high frame rate.
In a further aspect, the invention provides a method of processing video, comprising: receiving a video signal at a high frame rate, the high frame rate being substantially higher than conventional television frame rates; and performing video processing at the high frame rate to produce a processed video signal at the high frame rate. Preferably, the method comprises converting the processed video signal to a low frame rate lower than the high frame rate, preferably a standard television or film frame rate (e.g. 24fps, 25fps or 30fps); and outputting the low frame rate processed video signal. This can enable a variety of processing techniques and effects that might not be possible if starting from a low frame rate signal. Also, by performing video processing in the high frame rate domain prior to down-converting to a lower frame rate, image quality can in some cases be improved.
The step of performing video processing may comprise performing video editing. Video editing at high frame rate can enable greater editing accuracy. Alternatively or additionally, video processing may comprise performing effects processing, preferably to apply a shuttering or lighting effect.
For example, the step of performing video processing may comprise applying a synthetic camera shuttering effect to the video signal.
In a further aspect of the invention, there is provided a method of processing video, comprising: receiving a video signal at a given (preferably high) frame rate, the given frame rate preferably being substantially higher than conventional television or film frame rates; and processing the signal to apply a synthetic camera shuttering effect to the video signal.
Applying a camera shuttering effect may comprise applying one or more temporal filters to the video signal. The one or more temporal filters may include one or more of: a non-rectangular temporal filter; an exponential temporal filter; a Gaussian temporal filter; a sinc function temporal filter; a windowed sinc function temporal filter; a non-linear temporal filter such as a median or rank order or morphological filter. The shuttering effect may be applied differentially across the video image. The shuttering effect may be applied differentially depending on the type of motion. Shuttering may be applied differentially for different colour components.
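As an illustrative sketch of one such non-rectangular filter (the Gaussian mentioned above), the following applies a normalised Gaussian temporal weighting across a stack of high frame rate frames to synthesise a soft shutter; the function name, sigma value and time-first array layout are assumptions, not taken from the disclosure:

```python
import numpy as np

def synthetic_shutter(frames, sigma=2.0):
    """Apply a Gaussian temporal filter across a stack of high frame rate
    frames (shape: time x height x width), emulating a soft camera
    shutter. The weights are normalised so brightness is preserved."""
    t = np.arange(frames.shape[0])
    centre = (frames.shape[0] - 1) / 2.0
    w = np.exp(-((t - centre) ** 2) / (2 * sigma ** 2))
    w /= w.sum()
    # Weighted sum over the time axis collapses the stack to one frame.
    return np.tensordot(w, frames, axes=(0, 0))

frames = np.random.rand(16, 4, 4)   # 16 high-rate frames of a 4x4 image
out = synthetic_shutter(frames)
print(out.shape)  # → (4, 4)
```

A rectangular (uniform) weighting would emulate a conventional open/close shutter; swapping in an exponential or windowed sinc weight vector gives the other filter shapes listed above.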
As a further example, the step of performing video processing may comprise detecting the effect on a scene (as represented in one or more frames) caused by a variable light source based on the characteristic rate and/or phase of variation of the light source, and may optionally comprise modifying the scene accordingly.
Specifically, the step of performing video processing may comprise identifying regions of or objects in one or more video images where elements of the scene are illuminated by a given light source by detecting the characteristic rate and/or phase of flicker of the light source. The method may further comprise modifying one or more video images or identified objects or regions thereof to modify, reduce or remove illumination caused by the light source. The method may comprise distinguishing between multiple light sources based on respective flicker rates or phases of the light sources.
The method may also comprise processing the video image to remove a short instantaneous flash from a particular light source (e.g. due to flash photography or lightning). The high frame rate can thus enable detection of image contributions which might not be detectable at conventional frame rates.
The method may comprise detecting the one or more frames or parts of frames illuminated by the flash based on one or more of: higher than average luma; desaturation of colours; a degree of white clipping exhibited by the signal. Preferably, the method comprises removing the flash at least partially by a process including one or more of: applying a median filter or other filter in time to the detected frames or parts of frames; reducing the luma level of the detected frames or parts of frames to that of surrounding frames or parts of frames; replacing one or more affected frames or parts of frames of the image with frames or parts of frames generated by interpolation from surrounding frames or parts of frames. The term surrounding may here mean, in relation to parts of frames, spatially or temporally adjacent or near; and, in relation to whole frames, temporally adjacent or near.
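A minimal sketch of the flash-removal process described above: flash frames are detected by anomalously high mean luma and replaced with a pixel-wise temporal median of the surrounding frames. The threshold constant `k` and the neighbourhood size are illustrative assumptions:

```python
import numpy as np

def remove_flash(frames, k=1.5):
    """Detect frames whose mean luma is anomalously high (a suspected
    photographic flash) and replace them with the pixel-wise temporal
    median of neighbouring frames. `k` sets how many standard
    deviations above the mean luma counts as a flash."""
    luma = frames.mean(axis=(1, 2))
    flash = luma > luma.mean() + k * luma.std()
    out = frames.copy()
    for i in np.where(flash)[0]:
        lo, hi = max(0, i - 2), min(len(frames), i + 3)
        neighbours = [frames[j] for j in range(lo, hi) if j != i]
        out[i] = np.median(np.stack(neighbours), axis=0)
    return out

frames = np.full((9, 4, 4), 0.2)
frames[4] = 1.0  # a one-frame flash
clean = remove_flash(frames)
print(clean[4].max())  # → 0.2
```

At high frame rates a flash occupies several whole frames, so it can be excised cleanly; at conventional rates it would instead partially contaminate one or two frames.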
The step of performing video processing may also comprise filtering a video sequence. Filtering may be applied differentially across an image. Temporal filtering may be performed. In one example, filtering may be performed by applying a three dimensional convolution function to the video sequence, wherein the convolution function is shaped to filter along the trajectory of motion present in the video sequence.

In a further aspect, the invention provides a method of processing video to detect or modify the contribution to a scene from a regularly varying light source, the method comprising: capturing the video at a frame rate which is greater than the rate of variation of the light source; and detecting a contribution to one or more frames of video due to the light source based on the rate and/or phase of variation of the light source; and optionally further comprising modifying one or more of the frames of video based on the detected contribution. The frame rate is preferably substantially greater than the rate of variation or flicker rate of the light source, preferably at least twice the flicker rate, more preferably at least four times or at least ten times the flicker rate. The frame rate may be as described above. The method may comprise modifying the frame or frames to modify, increase, reduce or remove the detected contribution due to the light source.
The frame rate of the source video is preferably at least twice the rate of variation (the flicker rate or frequency) of the light source. Multiple light sources may be detected/processed, in which case the frame rate is preferably at least twice the maximum frequency of the light sources.
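As a hedged sketch of the detection step (not from the disclosure), a contribution varying at a known flicker rate (e.g. the 100Hz intensity variation of 50Hz mains lighting) can be estimated per pixel by projecting each pixel's time series onto quadrature sinusoids at that rate; the function and variable names are assumptions:

```python
import numpy as np

def flicker_amplitude(frames, fps, flicker_hz):
    """Estimate, per pixel, the amplitude of an intensity variation at a
    known flicker frequency by projecting each pixel's time series onto
    sin/cos at that frequency. Assumes fps is well above twice the
    flicker rate, per the sampling requirement in the text."""
    n = frames.shape[0]
    t = np.arange(n) / fps
    s = np.sin(2 * np.pi * flicker_hz * t)
    c = np.cos(2 * np.pi * flicker_hz * t)
    a = np.tensordot(s, frames, axes=(0, 0)) * 2 / n
    b = np.tensordot(c, frames, axes=(0, 0)) * 2 / n
    return np.hypot(a, b)  # per-pixel flicker amplitude

fps, hz = 1000, 100
t = np.arange(200) / fps
flickering = 0.5 + 0.3 * np.sin(2 * np.pi * hz * t)
frames = np.zeros((200, 2, 2))
frames[:, 0, 0] = flickering   # pixel lit by the flickering source
frames[:, 1, 1] = 0.5          # pixel under steady light
amp = flicker_amplitude(frames, fps, hz)
print(np.round(amp[0, 0], 2), np.round(amp[1, 1], 2))  # → 0.3 0.0
```

Pixels with a large amplitude at a given rate/phase are those illuminated by that source; subtracting the projected sinusoidal component from their time series would implement the "reduce or remove" step, and repeating at a second frequency would distinguish multiple sources.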
The contribution may be detected in part based on differences in colour temperature. Modifying one or more frames may comprise correcting differences in white balance or colour shading within the one or more frames.

The invention also provides a method of compressing a video signal, preferably (though not necessarily) having a frame rate substantially higher than conventional television or film frame rates, the video signal having two spatial dimensions and a temporal dimension, the method comprising: dividing the signal into a plurality of three-dimensional blocks of video data; and encoding the three-dimensional blocks. Encoding a block preferably comprises calculating a frequency domain transform of the block; and quantising the resulting transform coefficients. The method preferably comprises calculating prediction information for a block, and encoding the block using the prediction information. The method preferably comprises performing spatial and/or temporal prediction. Using such a three-dimensional compression process, in particular for a high frame rate signal, can improve compression efficiency by exploiting the high degree of similarity between frames. The invention also provides a method for decompressing a signal compressed using such a compression process.
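A minimal numpy-only sketch of the three-dimensional block coding idea: transform an 8×8×8 (time × height × width) block to the frequency domain, quantise the coefficients, and reconstruct. A practical codec would typically use a DCT, prediction and entropy coding; the FFT and the uniform quantiser here are stand-ins, and the step size is arbitrary:

```python
import numpy as np

def encode_block(block, step=51.2):
    """Transform a 3D (time x height x width) block of video to the
    frequency domain and apply a uniform quantiser to the coefficients
    (real and imaginary parts are rounded separately)."""
    coeff = np.fft.fftn(block)
    return np.round(coeff / step) * step

def decode_block(quantised):
    """Inverse transform the quantised coefficients back to video."""
    return np.real(np.fft.ifftn(quantised))

# A block that is constant along x and varies smoothly in t and y, as a
# slowly changing high frame rate scene might: it compresses well, since
# every coefficient outside the kx=0 plane is zero before quantisation.
block = np.fromfunction(lambda t, y, x: np.sin((y + 0.1 * t) / 3), (8, 8, 8))
quantised = encode_block(block)
recon = decode_block(quantised)
print(np.count_nonzero(quantised) <= 64)  # → True (at most the kx=0 plane survives)
```

The higher the frame rate, the more similar adjacent slices of each block are, so more energy concentrates in a few low-frequency coefficients and more coefficients quantise to zero, which is the efficiency gain the text refers to.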
The invention also provides a video processing system or apparatus adapted to perform any method as described herein, and a computer program or computer program product comprising software code adapted, when executed on a data processing apparatus, to perform any method as described herein.
Where reference is made above (and in the appended claims) to "means for" performing some act, such means may, for example, include a processor and associated memory, suitably programmed to perform the act. More generally, the invention also provides a computer program and a computer program product for carrying out any of the methods described herein and/or for embodying any of the apparatus features described herein, and a computer readable medium having stored thereon a program for carrying out any of the methods described herein and/or for embodying any of the apparatus features described herein. The invention also provides a signal embodying a computer program for carrying out any of the methods described herein and/or for embodying any of the apparatus features described herein, a method of transmitting such a signal, and a computer product having an operating system which supports a computer program for carrying out any of the methods described herein and/or for embodying any of the apparatus features described herein.
The invention extends to methods and/or apparatus substantially as herein described with reference to the accompanying drawings. Any feature in one aspect of the invention may be applied to other aspects of the invention, in any appropriate combination. In particular, method aspects may be applied to apparatus aspects, and vice versa.
Furthermore, features implemented in hardware may generally be implemented in software, and vice versa. Any reference to software and hardware features herein should be construed accordingly.
Preferred features of the present invention will now be described, purely by way of example, with reference to the accompanying drawings, in which:
Figure 1 illustrates a high frame rate television production and broadcast system in overview;
Figure 2 illustrates a process of capturing, processing and displaying a high frame rate video signal;
Figure 3 illustrates low frame rate and high frame rate signals;
Figure 4 illustrates a process for high frame rate video processing and downsampling;
Figure 5 illustrates a video compression process using a high frame rate video source;
Figure 6 illustrates a method of video compression;
Figure 7 illustrates three-dimensional (spatial/temporal) block coding and prediction; and
Figure 8 illustrates 1-bit image capture.
High frame rate video system

A high frame rate television/film production system is shown in overview in Figure 1.
The system comprises a high frame rate camera 12 for providing source video and a high frame rate editing system 16 for editing the captured video. Storage 14 is provided for storing unedited and edited video content, and may include disc storage, tape storage, and/or any other suitable storage media.
Editing system 16 may make use of a variety of editing/post-production components, such as an animation generator 18, effects unit 20, graphics generator 22 and video mixer 24. Edited or raw video content is transmitted using a broadcast subsystem 26 to a broadcast network 30, such as a digital or analogue terrestrial, satellite or cable network. Video content may also be made accessible via a media server 28 to be accessed over a data network, for example the Internet 32. Transmitted content is received from the broadcast network or Internet by end-user devices such as a set-top box 34, personal computer 36 or cinema display system 38.
All the elements of the system are adapted to operate at a substantially higher frame rate than is used in current conventional television/film systems, as is described in more detail below.
Figure 1 shows representative elements of such a television system. A given system may contain only some of the elements shown and may contain other elements not shown. Only one of each element is shown, though the system will typically include multiple such elements, e.g. multiple cameras, set-top boxes, effects processors etc. The system will typically also include compression/decompression functionality to enable compression of digital video data for storage and transmission.
Embodiments of the invention provide a high frame/field rate television and film system, in which the quality of the reproduced image is improved by the use of increased frame and/or field rates throughout the entire system.
Thus, video or film pictures are captured at a rate in excess of the conventional 24/50/60 frames/fields per second and are stored, transmitted and reproduced at this higher frame/field rate.
For example, the system could run at a rate of 300 frames per second. In that case all the cameras 12, editing system components (e.g. animation generators 18, graphics generators 22, video mixers 24 and effects units 20) would run at 300 frames per second to create an output which is stored or distributed as a 300 frames per second video signal. The transmission system then broadcasts this high frame rate signal to the home or to cinemas or other places for decoding and display, again at the higher frame rate. The high frame rate signal can also be recorded onto fixed media such as disc/tape or downloaded file for sale to the home or onto other storage media for distribution to cinemas or other venues.

The resulting TV pictures can give far better rendition of motion and of moving detail than current TV and film systems, in particular by reducing motion blur. The reduction in motion blur can greatly improve the sharpness of anything in the picture which is moving relative to the camera/display. The removal of motion blur can also improve the sense of reality of the image, removing the tendency for moving objects to become transparent. The sharpening of moving edges can lead to an improved occlusion of backgrounds, improving the sense of the image being real and three-dimensional. The resulting TV images can also have lower levels of motion aliasing.
This can apply at the level of object structure, such as the effect of wagon wheels appearing to be going backwards, and at the level of image structure, where an edge moves across the sampling structure. The lower levels of temporal aliasing can also lead to improved compression efficiencies so that the increase in frame rate does not necessarily lead to proportionately higher storage requirements, as both lossy and lossless compression rates can be improved as the frame rate is increased.
Increased frame rates at the camera and display can also reduce other flicker and motion related effects such as photosensitive and pattern-sensitive epilepsy.
Video effects processing (e.g. by animation generator 18, effects unit 20, graphics generator 22 and video mixer 24) can be performed at the higher frame rate to achieve higher-quality effects. Video editing can be performed (e.g. by editing system 16) more accurately due to the increased temporal resolution.
Instead of camera 12, other video sources may be used. For example, a 3D animation/rendering system may provide the source video at the high frame rate.
The frame rate chosen is preferably substantially higher (preferably many times higher) than the conventional film / television frame or field rates. Such conventional rates include 24fps film rate; 25fps television (e.g. PAL), 30fps/29.97fps television (e.g. NTSC); and corresponding field rates of 50 / 60(59.94) fields-per-second. For many practical purposes, the NTSC rate of 29.97fps can be approximated to 30fps; for example, standards conversion can be achieved by accelerating a 29.97fps signal to 30fps prior to performing further processing. Thus, references herein to a 30fps rate are intended to also include the standard NTSC rate of 29.97fps (and references to 60 fields- per-second field rates similarly include the standard NTSC field rate of 59.94 fields-per-second).
Some of the benefits of the high frame rate system can be realised at frame rates of, for example, 65 or 70fps (for example, the improvements in the portrayal of motion can in some respects be proportional to the frame rate). However, other advantages, such as the lower delay in video processing systems and improved lip synch would not be significant until much higher frame rates were used (for example in excess of 100fps).
For a practical system, a frame rate of at least 80fps is therefore preferably used. Preferably, the frame rate is at least 100fps, more preferably at least 150 fps, more preferably at least 300fps. In some embodiments, an even higher frame rate may be selected, for example at least 600fps, at least 900fps or even at least 1200fps.
Preferably, the frame rate chosen has a simple mathematical relationship with one or more selected conventional frame rates. Preferably, the chosen frame rate is a multiple of at least one selected conventional frame rate or field rate. This simplifies conversion between a conventional signal and a high frame rate signal. More preferably, the frame rate is a multiple of several conventional frame/field rates, to improve interoperability with selected existing systems. The rate can be chosen based on requirements, i.e. based on the conventional systems with which interoperability is desired. For example, a frame rate may be selected which is a multiple of both conventional television frame rates of 25 and 30, e.g. 150fps, or which is a multiple of both conventional television field rates of 50 and 60, e.g. 300fps or 600fps to allow conversion and interoperability with existing PAL/NTSC systems. A frame rate with a simple mathematical relationship with (e.g. being an exact multiple of) each of the television frame/field rates of 25fps/30fps and 50 and 60 fields-per-second and also the conventional film rate of 24fps is particularly preferred, such as 600fps or a multiple thereof, since this simplifies conversion between film sources as well as NTSC, PAL and HDTV systems.
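The arithmetic behind the preferred 600fps choice can be checked directly; this purely illustrative snippet verifies that 600 is the lowest rate that is an exact multiple of the 24fps film rate, the 25fps and 30fps frame rates, and the 50 and 60 fields-per-second field rates:

```python
from math import lcm

rates = [24, 25, 30, 50, 60]   # conventional film/TV frame and field rates
common = lcm(*rates)           # lowest rate divisible by all of them
print(common)                  # → 600
# Integer decimation/repetition factor for converting to each rate:
print({r: common // r for r in rates})  # → {24: 25, 25: 24, 30: 20, 50: 12, 60: 10}
```

Each factor being an integer is what makes conversion "simple": a 600fps signal can be reduced to any of these rates by dropping or averaging a whole number of frames per output frame, with no temporal resampling.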
The frame information for the high frame rate signal may be encoded in any suitable way, for example using conventional 8-bit or 10-bit per colour component encoding in RGB, YUV, YCrCb or any other suitable colour space. Conventional compression techniques may also be used to reduce the data rate of the signal, including intraframe and interframe compression (e.g. using motion prediction).
In summary, embodiments of the invention thus provide an improved television, video and/or film system utilising higher frame/field rates than current systems. The system comprises a camera and display, connected together via a link capable of capturing, conveying and displaying pictures at rates higher than current television, video and film rates. This can be attached to storage for the recording of these pictures or connected to a link for live broadcast of these pictures.
Further embodiments provide:
• A production system capable of editing, mixing and otherwise manipulating these pictures at the higher frame/field rates.
• A link or broadcast system capable of sending these high frame/field rate pictures to other locations and a receiver.
• A storage system capable of recording the signals from the camera and production system and playing them back for further manipulation or broadcasting.
The use of a very high frame rate source video signal enables a range of improved video processing, editing and compression techniques. For example, embodiments of the invention may provide:
• A system that produces motion information from the high frame rate captured pictures and then transmits those pictures at a lower frame rate along with the motion information, such that pictures at a higher frame rate than transmitted can be reproduced at the receiver or display device (this is discussed in more detail below).
• High frame/field rates which have a simple mathematical relationship with both 50 and 60 fps (e.g. 300 fps) can allow simple standards conversion to both field rates. (A frame/field rate of 600fps would also provide simple conversion to 24fps.)
• A video/film editing system which gives the opportunity to make finer grained edits than conventional video/film editing systems, i.e. giving an editing accuracy down to a single frame at the higher frame rate.
• A system of volumetric coding for video compression which takes advantage of the improved temporal sampling, eliminating conventional motion compensation and using volumetric and directional transformations.
• A system of noise reduction which takes advantage of the much greater similarity of adjacent frames, by means of thresholding or quantisation of coefficients in three-dimensional transform domains.
• A system of noise reduction which takes advantage of the higher frame/field rate to improve the ability to differentially reduce noise in different parts of the picture
• A system of noise reduction which takes advantage of the higher frame/field rate to improve the ability to reduce noise in moving objects by filtering along the direction of motion
Capturing video at very high frame rates in conventional artificial lighting conditions may be problematic in that light intensity may vary in accordance with the phase of 50Hz or 60Hz mains power sources. This could introduce marked illumination differences between frames and "beat notes" in the resulting video which could have a temporal frequency of only a few hertz due to timing differences between the video and the light sources.
Embodiments may therefore also include:
• A set of temporal-spatial filters to remove colour and level differences between frames in conventional artificial lighting
• A system of illumination using a direct current source or an alternating current source at a much higher rate than the camera frame rate or an alternating current source synchronised with the camera frame rate or some other means, such that the variation in scene illumination between frames is minimised.
High frame-rate video with one-bit sampling
A further embodiment will now be described which provides a television system whereby the quality of the reproduced image is improved by the use of one bit sampling of the image at very high frame rates.
In this embodiment, video or film pictures are captured at a bit-depth of one bit per colour component at a rate greatly in excess of the current 24/50/60 frames/fields per second, and are stored, transmitted and reproduced using the one bit representation. The representation may use RGB, YUV, YCrCb or any other suitable colour space, with each component sampled with a one-bit bit depth. The frame rate is selected to be sufficiently high so that the integration performed by the human eye renders the one-bit nature of each frame imperceptible. To achieve this, the 1-bit system preferably uses a frame rate of at least around 10,000fps. Some embodiments use a frame rate between 100,000fps and 1,000,000fps, although still higher frame rates could also be used.

The system is illustrated in overview in Figure 2. A high frame rate camera 40 captures a high frame rate 1-bit signal 42. Processing 44 may optionally be performed. The high frame rate 1-bit signal is then provided to a suitably adapted display 46. Alternatively, conversion stage 48 may be provided to convert the signal to a conventional low frame rate signal suitable for a standard display 50. The system may include the various other elements discussed previously in relation to Figure 1.
This embodiment shares many advantages with the previously described embodiments, for example in relation to the realism of the images and the rendition of motion, in particular the reduction of motion blur and motion aliasing artefacts. The use of one bit sampling and reproduction can also reduce and possibly eliminate other flicker and motion related effects such as photosensitive and pattern-sensitive epilepsy.
The system can be a good match to certain modern display technologies such as Plasma Display Panels or back- and front-projected systems based on Digital Mirror Devices, as only the driving electronics would need to be changed to make them one-bit video compatible.
According to one embodiment, an improved television, video and/or film system utilising bit-depths of one bit and much higher frame/field rates than current systems is provided.
The system includes a display capable of displaying video frames with a bit-depth of one bit (per colour component) at a frame rate much higher than that of conventional television and cinema: high enough that the integration performed by the human eye renders the one-bit nature of each frame imperceptible. Plasma Display Panels and Digital Mirror Devices are examples of technologies that could be used to implement such a display.
Specifically, in such a display, instead of displaying a pixel with different brightness values for each of a number of colour components (typically RGB for digital displays), each colour component simply has two states: on or off. However, due to the very high frame rate of the signal, the pixel component may be switched on and off rapidly to produce the effect of intermediate brightness values. This effect occurs since the human eye is unable to resolve the transitions at the very high frame rates used. Different switching frequencies of a colour component then correspond to different brightness values for the colour component.
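As a trivial numeric illustration of this effect (not from the disclosure): if one colour component of a pixel is switched on for every third frame of a sufficiently fast stream, the eye's temporal integration yields a perceived brightness of one third of full scale:

```python
import numpy as np

# One colour component of one pixel over 999 high-rate frames, on for
# every third frame. The eye integrates the stream over time, so the
# perceived brightness is the pulse density: one third of full scale.
stream = np.tile([1, 0, 0], 333)
print(round(stream.mean(), 3))  # → 0.333
```

Other on/off patterns with the same pulse density produce the same perceived level; dithering or error feedback merely chooses patterns that also avoid visible low-frequency structure.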
The viewer in effect sees a full colour image (i.e. an image providing a greater colour range than a static single-bit-per-component image could actually provide), preferably substantially corresponding in colour range at least to a conventional television image. The system may further include a television camera containing a sensor capable of capturing, digitising and outputting video frames with a bit-depth of one bit, at a rate equal to or greater than the display rate described above. A conventional digital video camera that supports image capture at a sufficiently high frame rate could be used to acquire one-bit video merely by applying a threshold to each pixel value and making appropriate use of dithering or error feedback techniques. Alternatively, an image sensor specifically adapted for 1-bit sampling may be used. An unshuttered image capture device may be used. For example, capture of 1-bit video can use delta-sigma modulation, used as an analogue-to-digital converter (ADC), where the 1-bit video signal can be considered as pulse-density modulation (the level over a given time period is given by the number of pulses or non-zero values over that period). In such a system, each pixel (or pixel colour component) in the sensor accumulates charge in proportion to the light that has fallen on it, and may be considered as an integrator circuit. This is connected to a threshold detector which determines whether the pixel charge is greater than a pre-determined threshold. The detector outputs a pulse of size equal to the threshold if the charge is greater than the threshold, and this signal is output. At the same time, the output of the threshold detector is subtracted from the pixel level, so that its level becomes the excess of its original level over the threshold, plus any further amount that has accumulated in the intervening time.
This is illustrated in Figure 8, which shows pixel accumulator 122 providing an accumulated pixel value to threshold detector 124. If the accumulated pixel value exceeds the threshold, a '1' is output (128) and the threshold value is subtracted by subtractor 120. Otherwise a '0' is output. The input (126) is the light level and the output (128) is a pulse train.
Note that the threshold detector will work on a clock (130) equal to the 1-bit sample rate.
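A minimal software model of the Figure 8 arrangement (accumulator 122, threshold detector 124 and subtractor 120) might look as follows; the unit threshold and the constant light level are illustrative assumptions:

```python
def delta_sigma_pixel(light, threshold=1.0):
    """First-order delta-sigma model of one sensor pixel (Figure 8).

    `light` is the per-clock light increment falling on the pixel; the
    accumulated charge is compared against `threshold` on each clock, a
    '1' pulse is emitted when it is exceeded, and the threshold is then
    subtracted from the accumulator."""
    charge = 0.0
    out = []
    for sample in light:
        charge += sample            # pixel accumulator 122 integrates light
        if charge >= threshold:     # threshold detector 124
            out.append(1)
            charge -= threshold     # subtractor 120 removes one threshold
        else:
            out.append(0)
    return out

# A constant light level of 0.25 per clock yields a pulse density of 0.25:
# the level over a period is given by the number of pulses in that period.
pulses = delta_sigma_pixel([0.25] * 400)
density = sum(pulses) / len(pulses)
```

The pulse-density-modulated output carries the light level as the proportion of non-zero samples, exactly as described for the 1-bit video signal above.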
In practice the subtractor and threshold detector will preferably be integrated with the sensor pixel, and will convert a proportion of the charge on the sensor to an electrical pulse signal if (and only if) the charge exceeds the threshold. Figure 3 illustrates an example of a high-frame-rate signal 64 including a sequence of frames at a high rate. A given pixel 66 of a frame's array of pixels comprises RGB colour components encoded using a single bit each. A multi-bit sampled low frame rate signal 60 is also shown by way of comparison. The system also includes a digital link or broadcast system with a bandwidth sufficient to carry the data being output by the camera and/or fed into the display in a compressed or uncompressed form. The system may also incorporate the following:
• A production system capable of editing, mixing and otherwise manipulating video in the one-bit format.
• A storage system capable of recording the signals from the camera and production system and playing them back for further manipulation or broadcasting.
• A conversion system such as a system based on the principles of delta-sigma modulation to enable conventional film and video pictures to be converted to the one-bit video format, and video in the one-bit format to be down converted to current film and TV frame/field rates. This is discussed in more detail below.
• The use of a data-rate reduction system to reduce the bandwidth required by the system. For example, run-length coding could be employed. Other, more complex compression techniques may also be used.
• A video/film editing system which gives the opportunity to make finer-grained edits in time than conventional video/film editing systems, i.e. giving an editing accuracy down to a single frame at the higher frame rate of the one-bit video format.
A higher spatial resolution than conventional television formats may also be used for recording and/or display of the one-bit format.
Conversion between one-bit and conventional video formats could be accomplished in a number of ways. Existing conversion techniques for converting between PCM (pulse code modulation) and PDM (pulse density modulation) signals can be extended into the video domain. For example, to go from conventional video to a one-bit representation, the video signal can be put through a video standards-converter to increase its frame rate to that of the one-bit representation, and then the pixel signals (the signals corresponding to the values of each pixel as a function of time in the upconverted video stream) put through a delta-sigma modulator. To go from a one-bit representation to a conventional video representation, successive pixel values can be summed over a period corresponding to the length of a frame in the conventional representation, with the resulting pixel signals put through independent low-pass filters. Video captured in one-bit format at a higher frame or field rate is not necessarily equivalent to video captured with higher bit depth at a lower frame or field rate, since typical shuttering in cameras discards some of the information that would be captured in a one-bit system. One-bit video can allow for a fully flexible trade-off between bit depth/dynamic resolution and temporal and spatial resolution. It therefore allows very high spatial resolution video such as Ultra High Definition Television (UHDTV), currently restricted to a maximum of 60 frames per second, to be displayed at a much higher frame rate, but using less dynamic resolution. This restores a more appropriate balance between temporal and spatial resolution for these standards.
Conversion from one-bit to multi-bit conventional video effects an averaging of the video signal along a temporal axis. By choosing the start point for each temporal averaging window, full bit depth frames at different points in time may be constructed. Thus by performing a "running average" along the temporal axis, full bit depth frames at the higher frame or field rate can be obtained, to enhance playback especially for slow-motion effects.
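A sketch of both down-conversions described above — non-overlapping summation to a conventional frame rate, and a running average along the temporal axis — might look like this (the window sizes are illustrative assumptions):

```python
import numpy as np

def pdm_to_pcm(bits, window):
    """Down-convert a one-bit pixel signal to multi-bit frames by summing
    successive samples over each conventional-frame period."""
    bits = np.asarray(bits)
    return bits.reshape(-1, window).sum(axis=1)

def running_average(bits, window):
    """Sliding window along the temporal axis: yields full bit-depth
    values at the high frame rate, e.g. for slow-motion playback."""
    kernel = np.ones(window) / window
    return np.convolve(np.asarray(bits, dtype=float), kernel, mode='valid')

# 16 one-bit samples alternating 0/1 represent a mid-grey pixel signal.
bits = [0, 1] * 8
frames = pdm_to_pcm(bits, 8)        # two multi-bit frame values of 4
smooth = running_average(bits, 8)   # mid-grey (0.5) at every high-rate sample
```

Shifting the start point of each summation window, or sliding it one sample at a time as in `running_average`, yields full bit-depth frames at different points in time, as described above.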
One bit video with standard definition or high definition spatial dimensions captured at 8 or 10 times the frame or field rate may provide no increase to the overall data rate over normal SD or HD signals. Such a signal could be captured, stored and transmitted using standard video formats such as Serial Digital Interface (SDI) and the accompanying widely deployed SDI infrastructure. It could also, after conversion to multi-bit format, be compressed using conventional video coding methods, systems and standards. The difference between one-bit and multi-bit video would usually only become apparent, in this case, upon display with a one-bit capable display. A conventional display would display the video with reduced temporal resolution but increased dynamic resolution. If required, both temporal and spatial averaging could be applied in the conversion process.
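The data-rate observation above can be verified with simple arithmetic; the HD raster and bit depth used below are illustrative figures rather than values taken from the text:

```python
# Illustrative HD figures: 1920x1080 samples per component, 10 bits deep.
samples_per_frame = 1920 * 1080
conventional_rate = samples_per_frame * 10 * 50    # 10-bit video at 50 fps
one_bit_rate = samples_per_frame * 1 * (50 * 10)   # 1-bit video at 10x the rate
# Per colour component the raw data rates are identical, so the one-bit
# signal fits the same SDI payload as the conventional one.
```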
In colour spaces such as YCrCb where signals can take negative values, those signals may be represented with either one or two bits per sample; in the former case, a signal with a time-average of 0.5 represents the zero point; in the latter case one of the two bits encodes the signal's sign. Such a two-bit implementation may be substituted wherever reference is made herein to a one-bit encoding. Negative values can arise in YCrCb, YUV or similar colour spaces because of matrixing: Y or luma is a weighted average of R, G and B; Cb and U are scaled versions of B-Y; and Cr and V are scaled versions of R-Y. In other words, Cr and Cb (and similarly U and V) are colour difference components.
So if a 1-bit RGB source signal is assumed in which each of the R, G, B colour components takes values of 0 or 1, then the average of the RGB components, Y, can also take values of 0 or 1. This means, however, that the colour difference components R-Y and B-Y can take values -1, 0 or 1 and thus require more than 1 bit.
It may be noted that in standard 8 or 10 bit television signals, the colour difference components Cb and Cr also have an extra bit initially; however, the scaling factors applied to Cb and Cr are used to reduce the dynamic range of the colour components from 9 or 11 bits back down to 8 or 10 bits. Similarly, in the 1-bit system we have the option of keeping the colour difference components as two bits, or scaling them to reduce them to 1 bit. For example, this can be done by treating the 2-bit signal as if it is an analogue signal itself, and choosing a threshold determined by a scaling factor to get a 1-bit signal. To summarise the above, a 1-bit capture format will provide 1-bit data for each of the captured primary colours. Matrixing this will naturally produce a 1-bit signal for luma and 2-bit signals for other components. The other components (colour difference or chroma components) may also optionally be scaled by some means to produce a 1-bit signal. Thus, the present invention, and the various systems and methods described herein, preferably use either an RGB colour format using 1 bit for each of the R, G, B colour components, or a colour format using a luma component and multiple chroma components, where the luma component is encoded with 1 bit and the chroma components (colour difference components) are encoded with 1 bit or 2 bits each. Any of these formats are referred to herein by the term "one-bit video" or similar.
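The bit growth under matrixing described above can be demonstrated numerically; the luma weights below are the Rec. 601 coefficients, used here purely as an illustrative assumption:

```python
# Matrixing a 1-bit RGB signal; any fixed weighted average behaves the same.
def matrix_1bit(r, g, b):
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luma stays within [0, 1]
    cb = b - y                              # colour difference (unscaled)
    cr = r - y
    return y, cb, cr

# Enumerate all eight 1-bit RGB combinations: Y never leaves [0, 1], but
# Cb and Cr go negative, hence the extra (sign) bit noted above.
vals = [matrix_1bit(r, g, b) for r in (0, 1) for g in (0, 1) for b in (0, 1)]
cb_min = min(v[1] for v in vals)
cb_max = max(v[1] for v in vals)
```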
One-bit video is particularly suited to a novel method of chroma sub-sampling: in colour-space representations that separate chrominance and luminance (e.g. YCrCb), chroma sub-sampling can be implemented by sampling the chroma signals at a lower temporal rate, for example half the sampling rate used for the luminance (or luma) signal. This can be combined with conventional spatial chroma sub-sampling techniques to obtain a further advantage. However, the above described temporal chroma sub-sampling technique may also be used with multi-bit sampled high frame rate signals (e.g. using conventional bit depths).
Conversion between colour spaces ("matrix conversions") in the one-bit video domain that involve multiplication by non-integer scaling factors (e.g. RGB->YCrCb) may be implemented on a sample-by-sample basis using a probabilistic multiplication method, in which the weighted sum of the R, G and B samples is used as the probability that the output luminance will be 1, by comparing it to the output of a random number generator.
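A minimal sketch of this probabilistic multiplication follows; the luma weights are again the Rec. 601 coefficients, assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def probabilistic_luma(r, g, b):
    """One-bit colour-space conversion by probabilistic multiplication:
    the weighted sum of the 1-bit R, G, B samples is treated as the
    probability that the output luma sample is 1, by comparing it with
    a random number generator's output."""
    p = 0.299 * r + 0.587 * g + 0.114 * b
    return int(rng.random() < p)

# Over many samples of a fixed 1-bit input (R=1, G=0, B=1), the output
# pulse density converges on the weighted sum 0.299 + 0.114 = 0.413.
samples = [probabilistic_luma(1, 0, 1) for _ in range(20_000)]
density = sum(samples) / len(samples)
```

The output remains a strictly one-bit signal, but its time-averaged level carries the non-integer-weighted result.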
A one-bit high frame rate video system may also be used in still-frame photographic applications. In this application, the stream of video samples may be averaged in different ways to produce the effect of selecting different exposure times in the device. In this way the trade-off between aperture (sharpness), depth of field and dynamic resolution can be made purely in post-processing and changed at will subsequently.
Still cameras use sophisticated tracking technologies to perform image stabilisation. The use of a one-bit high frame rate system provides multiple samples in time, which may be used to track motion more directly on the sensor. In this way image stabilisation can be performed more robustly using simple correlation techniques.
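A low-complexity correlation of the kind contemplated here could be sketched as a brute-force search for the best global shift between successive frames; the search radius and frame size below are illustrative assumptions:

```python
import numpy as np

def global_shift(a, b, max_shift=3):
    """Estimate the global translation between two successive frames by
    brute-force correlation over a small search window. The window can
    stay small because successive HFR frames differ very little."""
    best, best_score = (0, 0), -np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(b, dy, axis=0), dx, axis=1)
            score = (a * shifted).sum()     # correlation score
            if score > best_score:
                best_score, best = score, (dy, dx)
    return best

# A frame and a copy translated by one row and two columns: the
# estimator recovers the shift needed to realign the second frame.
rng = np.random.default_rng(3)
frame = rng.random((16, 16))
moved = np.roll(np.roll(frame, -1, axis=0), -2, axis=1)
shift = global_shift(frame, moved)
```

Accumulating such per-frame shifts over the high-rate sequence gives the motion track used for stabilisation.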
The invention may additionally provide systems, apparatus, methods and computer program products to implement:
• Carriage of 1-bit or 2-bit video over conventional video infrastructure, especially SDI, by choosing a frame rate 8 or 10 times that of a conventional video frame rate
• Extension of the temporal resolution of conventional video formats by capture and display using a one or two bit video format at a frame rate 256 or 1024 times that of a conventional video frame rate
• Construction of a high colour depth (or high dynamic range) frame or field at any temporal point at which a 1-bit frame is captured, by performing averaging of neighbouring frames starting from or centred on the frame or field in question.
• Construction of a high colour depth (or high dynamic range) signal at the full high frame rate by performing a running average of a 1-bit or 2-bit high frame rate video signal
• Compression and delivery of a high frame rate one bit or two bit video signal by conversion to a multi-bit format, by means of temporal and/or spatial averaging and the use of conventional video compression standards and apparatus
• Capture of a still frame of video in a photographic device using a one-bit high frame rate sensor capturing a sequence of bits over a number of time intervals.
• Selecting different temporal averaging to achieve the effect of different shuttering intervals in a camera, for both still image and video capture.
• Use of a high frame rate one bit video signal to capture a still frame or moving video sequence and perform image stabilisation using low complexity correlation/global motion estimation between successive samples in time.
Compression and delivery of high frame rate video
Conventional HDTV and Digital Cinema systems require 1.5Gb/s or more for uncompressed data transfer. This is already substantial, and compression is widely used in television and even film production. HFR systems will require correspondingly more data rate, equal to or greater than the 50Gb/s required by the emerging UHDTV standards. Compression is required in production and post-production, for transfer and storage of video, and for final delivery. Current production/post-production compression systems are intra-frame only, with compression ratios at most 10:1 for high quality video. This is insufficient for HFR video. Conventional video compression systems also provide inter-frame compression by means of motion compensation but such systems would also be limited by the very large number of motion vectors required: a set of vectors for every frame of video.
Embodiments may therefore include HFR compression systems using alternate techniques based on 3-dimensional transforms, exploiting the very great similarity between frames (in general, the higher the frame rate, the more similar adjacent frames are likely to be). An example is illustrated in Figure 6. Here an HFR source 99 provides frames of video, which are buffered in a video frame buffer 100. Data is extracted from the frame buffer in three dimensional blocks of samples for each video component (Y, U or V, or R, G or B) by the sample blocker 101. Each block may then be transformed into the frequency domain and quantised by means of a 3-D transform and quantiser 103 and then passed to an entropy coder 104. The transform used in 103 may be a wavelet transform, or a Discrete Cosine Transform or a Lapped Orthogonal transform or some other transform. The blocks of samples may overlap, spatially or temporally, with other blocks and a window function (for example a raised cosine function) may be applied to the blocks prior to quantisation and coding.
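An illustrative sketch of the 3-D transform stage 103 follows, using a separable Discrete Cosine Transform (one of the transforms named above); the block dimensions are assumed, and the example shows how near-identical frames collapse into the temporally-DC coefficients:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)
    m = np.cos(np.pi * (2 * np.arange(n) + 1) * k / (2 * n))
    m[0] *= 1 / np.sqrt(2)
    return m * np.sqrt(2 / n)

def transform_3d(block):
    """Apply the DCT along the temporal axis, then both spatial axes.
    For highly similar frames the energy concentrates in the t=0 plane,
    which is what makes the 3-D transform effective for HFR video."""
    t, h, w = block.shape
    out = np.tensordot(dct_matrix(t), block, axes=(1, 0))
    out = np.tensordot(dct_matrix(h), out, axes=(1, 1)).transpose(1, 0, 2)
    out = np.tensordot(dct_matrix(w), out, axes=(1, 2)).transpose(1, 2, 0)
    return out

# A block of 8 identical 4x4 frames: all energy lands in the t=0 plane,
# so the temporally-varying coefficients quantise to (near) zero.
frame = np.arange(16.0).reshape(4, 4)
block = np.repeat(frame[None], 8, axis=0)
coeffs = transform_3d(block)
```

Quantisation (103) and entropy coding (104) would then operate on `coeffs`; overlapping windowed blocks, or a wavelet or lapped transform, could replace the plain DCT used here.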
The system may be enhanced by using prediction from previously coded data, for example from spatially neighbouring samples or samples from other frame-groups, either earlier or (if frames are re-ordered prior to coding) later in time. Here prediction data is subtracted from 3D data to be coded by a subtractor 102. After transform and quantisation an inverse 3D transform and quantisation block 105 reconstructs the coded coefficients, and the prediction initially subtracted from the block is added back in by an adder 106. The reconstructed samples are then fed into a sample deblocker 107 and thence into a buffer of reconstructed video frames 108, from which further predictions may be made. A 3D prediction estimator 109 produces a prediction vector for a block from data currently in the input buffer 100 and reconstructed data in the prediction buffer 108, and the 3D prediction generator 110 produces a prediction block from the previously coded samples and the prediction vector, which will be subtracted from the latest set of sample blocks.
A possible prediction method is illustrated in Figure 7, whereby reconstructed neighbouring samples in space and time are all available for use as a prediction. A vector may extrapolate these values in any combination of vertical, horizontal and temporal directions to obtain a prediction block for the rectangular block of uncoded coefficients.
The above-described compression method and system may be provided as an independent aspect of the invention.
Effects processing and downsampling of high frame rate video
In the above examples, the video signal is captured, processed and transmitted to the end user at the high frame rate. In alternative embodiments, the video signal may be converted to a lower frame rate signal prior to transmission, with the higher frame rate signal used for video processing/editing. For example, a lower frame rate can be achieved by combining captured frames. Additional motion information generated from the high frame rate source signal can be used to improve the compression of the signal.
Thus, one embodiment provides a television system that captures video at a substantially over-sampled frame rate but then down-samples to produce video at a conventional television frame/field rate (the target frame rate), such as 50Hz or 60Hz. Such a system affords improved video quality and also the advantage of allowing a range of effects to be achieved in post-production, including effects which would conventionally be achieved using the shutter of the camera apparatus. The process is illustrated in Figure 4.
A high frame rate (HFR) signal is obtained from an HFR source 80 (at a source frame rate), such as from a camera or from storage. A video effect (for example a shuttering effect or lighting effect as described in more detail below) is then applied to the signal by effects processor 82. A downsampler 84 generates a video output signal 86 at a target frame rate from the effects-processed HFR signal. By processing effects at the high frame rate, visual quality can be improved (even if the signal is subsequently down-sampled).
Instead of downsampling the signal, the processed signal could also be output to the end-user in HFR format as described in relation to the system of Figure 1. Some examples of effects will now be described.
Shuttering effects in post processing
As one example, by the use of an effectively unshuttered camera, the effect of camera shuttering can be applied as a post-processing step rather than being fixed at the time of capture. This process can be considered as the application of a multi-tap filter in the temporal domain, equivalent to the temporal shape of the desired shutter. These artistic decisions can be made at leisure during the post-production process, rather than having to be made at the time of video acquisition. The applied shuttering effect could be equivalent to a camera shutter of any duration, less than, equal to, or greater than the duration of a frame at the target frame rate, enabling effects to be achieved such as motion blur, smooth motion, or crisp, jagged (highly shuttered) motion. As a further example, temporal aliasing effects, such as "wagon wheels", can be selectively eliminated or intentionally introduced.
The applied shuttering effect can be temporally shaped by using a non-rectangularly shaped temporal filter, i.e. not a form of direct average or accumulation, making possible shuttering effects not currently realisable with existing camera technology. Examples of filters that may be used include exponential temporal filters; Gaussian temporal filters; sinc function temporal filters; windowed sinc function temporal filters; and non-linear temporal filters such as median, rank order or morphological filters. Multiple filters may be combined.
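Such multi-tap temporal filtering might be sketched as follows; the Gaussian kernel width and the downsampling factor are illustrative assumptions:

```python
import numpy as np

def apply_shutter(frames, kernel):
    """Synthesise a shutter in post-production: convolve each pixel signal
    with a temporal filter whose shape matches the desired shutter.  A
    rectangular kernel mimics an ordinary physical shutter; Gaussian or
    exponential kernels give shapes no physical shutter can produce."""
    kernel = np.asarray(kernel, dtype=float)
    kernel = kernel / kernel.sum()          # preserve overall brightness
    n = len(kernel)
    out = np.empty((frames.shape[0] - n + 1,) + frames.shape[1:])
    for i in range(out.shape[0]):
        out[i] = np.tensordot(kernel, frames[i:i + n], axes=(0, 0))
    return out

# Gaussian "shutter" over a constant mid-grey HFR sequence of 2x2 frames.
frames = np.full((20, 2, 2), 0.5)
gauss = np.exp(-0.5 * ((np.arange(9) - 4) / 2.0) ** 2)
shuttered = apply_shutter(frames, gauss)
target_rate_video = shuttered[::8]   # downsampler 84: keep every 8th frame
```

Varying the kernel per region of the image, rather than globally as here, would give the differential shuttering described below.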
The applied shuttering effect can be applied differentially across a scene, allowing, for example, an object of interest to appear crisp whilst the remainder of the scene appears blurred even though both may contain similar rates of motion. In a simple example, this may be achieved by varying, across the image, the number of frames of the input signal that contribute to a given pixel of an output frame. However, in more complex examples, any of the temporal filtering methods described above may be applied differentially.
In another example, shuttering may be applied differentially depending on the type of motion. Shuttering may also be applied differently to different colour components.
Noise reduction
Temporal filtering can be applied differentially across a scene to vary the amount of noise reduction applied to different parts of the scene.
Such a temporal filter can also be generalised to a three dimensional convolution function to be applied to the video sequence, where the third dimension is temporal. The convolution can be varied across the scene. The convolution to be applied at any given point on any given moment could be shaped to integrate along the path of motion present at that location. The effect of such a convolution is to apply filtering, such as noise reduction, along the path of motion.
Certain lighting-related effects can also be achieved using high frame rate video processing as will now be described.
Removal of Lighting Flicker
A frequent problem with television systems where the frame rate differs from that of the mains electricity is that the brightness of the lighting varies between one frame and the next. If the frame rate is close to, but different from, the mains frequency, this may cause a beat between the two, causing a noticeable brightness flicker on the picture.
In a simple high frame rate television system, this flicker may be present but at too high a frequency to be noticeable to the human eye, but it may reduce the efficiency of a bit-rate reduction system. The removal of this effect is therefore desirable, and the following describes how it may be achieved.
Additionally, this technique can be used to ensure that there is no flicker when video material shot under lighting at one mains frequency (for example, 50Hz) is converted for transmission or display at another rate (for example, 60Hz).
To remove lighting variation, if the video has been captured at a frame rate significantly higher than the lighting flicker rate, the lighting level may be determined by calculating the mean luma level over certain parts of the image, or over the whole image (that is, over a part of the image which may be consistent in colour or hue but varying in luma or luminance level). By means of a suitable band-pass filter applied to the signal representing the varying mean level, a correction signal may be derived which, when subtracted from all the luma values in that part of the image (or the whole image), will remove the periodic variation in luma caused by the fluctuations in illumination. The high-pass element of the band-pass filter will remove slow changes in the real scene; the low-pass element will remove noise or other unwanted high-frequency components from the signal. For example, this can be used to remove lighting variations from multiple sources, even if they are not in phase with each other or are not matched in frequency.
Identifying light sources
The high speed continuous flicker of some sources of illumination, such as fluorescent lighting, is not normally perceivable by the human eye and brain, or by conventional television systems. However, with a sufficiently high over-sampling frame rate (such as a frame rate that is twice as fast as the flicker rate, or faster) this can be detected by a television system and used to distinguish elements of the image illuminated by such a light source. Several sources can potentially be distinguished if their characteristic flicker varies with respect to the other sources in phase and/or frequency.
Filtering can also be applied to separate out the components of a scene's brightness due to the flicker of a particular illumination source, and then subtract that from the scene, or add it back in whilst also multiplying it by an arbitrary scalar. Light sources can be deliberately offset in phase to enable them to be separated more easily by filtering. In the final temporally down-sampled video, this will create the effect of selectively removing a particular light source from the scene or varying its apparent brightness. This could be used, for example, to separate parts of the image illuminated by daylight, which does not flicker, from parts of the image illuminated with a light which may or may not be of a different colour temperature, but which flickers. A difference in colour temperature may also assist in this segmentation. This could, for example, be used to correct differences in white balance or colour shading within an image.
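A minimal sketch of the flicker isolation and subtraction described above, using an FFT-domain band-pass for brevity (the frame rate, flicker frequency and bandwidth are assumed; a practical system would use a proper FIR/IIR band-pass):

```python
import numpy as np

def remove_flicker(frames, fps=1000.0, flicker_hz=100.0, bandwidth=10.0):
    """Band-pass the per-frame mean luma around the expected flicker
    frequency (100 Hz for 50 Hz mains) to derive a correction signal,
    then subtract it from every pixel of each frame."""
    mean_luma = frames.mean(axis=(1, 2))
    spectrum = np.fft.rfft(mean_luma)
    freqs = np.fft.rfftfreq(len(mean_luma), d=1.0 / fps)
    # Zero everything outside the flicker band: what remains is the
    # periodic illumination component only.
    spectrum[np.abs(freqs - flicker_hz) > bandwidth] = 0.0
    correction = np.fft.irfft(spectrum, n=len(mean_luma))
    return frames - correction[:, None, None]

# Synthetic test: a static mid-grey scene plus 100 Hz sinusoidal flicker,
# over-sampled at 1000 fps (one second of 2x2 frames).
t = np.arange(1000) / 1000.0
flicker = 0.1 * np.sin(2 * np.pi * 100.0 * t)
frames = 0.5 + flicker[:, None, None] * np.ones((1000, 2, 2))
clean = remove_flicker(frames)
```

Retaining `correction` instead of discarding it, and scaling it before adding it back, gives the arbitrary-scalar manipulation of a flickering source described above.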
Instantaneous high speed flicker, such as the flash from flash photography, can be detected and filtered out by applying the same or other filtering techniques whilst introducing few visual artefacts. For example, the frame or frames or parts of the image illuminated by the flash can be detected by the fact that they have a higher than average luma, or that the colours are desaturated, or that the signal exhibits a degree of white clipping, or some combination of these. Depending on the nature of the frame or frames or parts of the image, the flash may either be partially or wholly removed by means of a median or other filter in time, or by reducing the luma level to that of the surrounding frames, or by completely removing the affected frame or frames or parts of the image and replacing them by interpolation from the surrounding frames.
To detect an individual flash, the frame rate should preferably be sufficiently high so that no significant motion occurs between a flash-illuminated frame and neighbouring frames, so that the flash-illuminated frame may be discarded or interpolated. A rate of 300fps should be sufficient for SD and HD video, but in some cases the rate may need to be higher for higher spatial resolutions.
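A sketch of the flash detection and interpolation described above (the luma-deviation threshold is an assumption made for illustration):

```python
import numpy as np

def remove_flash(frames, sigma=3.0):
    """Detect flash-illuminated frames by their abnormally high mean luma
    and replace each by interpolating its temporal neighbours."""
    luma = frames.mean(axis=(1, 2))
    flash = luma > luma.mean() + sigma * luma.std()   # higher-than-average luma
    out = frames.copy()
    for i in np.flatnonzero(flash):
        prev_i = max(i - 1, 0)
        next_i = min(i + 1, len(frames) - 1)
        out[i] = 0.5 * (frames[prev_i] + frames[next_i])  # interpolate
    return out

# A short HFR sequence of mid-grey frames with one photographic flash:
# because little motion occurs between neighbouring frames, the flash
# frame can simply be reconstructed from its neighbours.
frames = np.full((30, 2, 2), 0.4)
frames[15] = 1.0                      # the flash-illuminated frame
restored = remove_flash(frames)
```

A temporal median over three or more frames would achieve a similar result, as the text notes.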
Improved compression of conventional video
By using an HFR source signal, more accurate motion vectors can also be computed for a video sequence to improve the efficiency of motion based inter-frame video compression of the down-sampled (conventional frame rate) video. The process is illustrated in Figure 5.
An HFR signal is obtained from an HFR source 90. Motion vector information is obtained by motion analysis 94. The HFR signal is also temporally downsampled by down-sampler 92 to produce a sequence of images at the target frame rate. The sequence of images at the target frame rate is compressed by compressor 96, using the motion vector information derived from the HFR signal to perform interframe compression, to produce compressed output 98. Any suitable known compression algorithms (e.g. an MPEG-based algorithm) may be used and modified appropriately to make use of the motion vector information derived from the HFR signal.
As an alternative or in addition to using motion information to improve compression of a temporally downsampled signal, motion information derived from the HFR signal may also be transmitted together with the low frame rate signal to a receiver (e.g. set-top box). The receiver can then generate an output signal at a frame rate which is higher than the transmission frame rate using the motion information (e.g. by interpolation). Specifically, the receiver can generate additional frames using the motion information. The output signal is then displayed e.g. on a television. A receiver not adapted or configured to provide a higher frame rate signal may simply ignore the motion information and just output the low frame rate signal as received, thus maintaining compatibility with standard equipment.
Embodiments of the invention thus provide an improved conventional television system incorporating a camera that captures at a frame rate substantially higher than that used by a conventional television system, together with various production apparatuses that manipulate or utilise this high frame rate video, but which eventually down-sample temporally back to a conventional television frame rate. Such production apparatuses may, for example, include:
• An apparatus that temporally down-samples back to the frame rate of conventional television and applies a video compression scheme that includes inter-frame motion based compression techniques, where the motion vectors required for such a compression scheme are computed from the temporally over-sampled video.
• the application of multi-tap filtering in the temporal domain, differentially across the image, when down-sampling to a conventional television frame rate, to synthesise a variety of camera shuttering effects, including many not achievable using an ordinary physical camera's shutter.
• the application of temporal filtering to distinguish objects or regions of video images where elements of the scene are illuminated by different light sources by detecting the characteristic flicker rate of the light source, whilst also removing that flicker when temporally down-sampling.
• the application of temporal filtering to separate out the aspects of an image's brightness due to a particular light source with a characteristic rate and/or phase of flicker, and then scale that component arbitrarily.
• the application of temporal filtering to separate out the aspects of an image's brightness due to a short instantaneous flash from a particular light source, such as flash photography, allowing that component to be scaled arbitrarily or completely removed.
• An apparatus that applies a three dimensional convolution function to a video sequence where the particular function is shaped to filter along the trajectory of motion present at any particular point.
Though the effects described above (for example shutter effects or lighting effects) are described in the context of a system which performs downsampling after the effects have been applied, the described effects can also be implemented in a system in which video is provided to the end user at the high frame rate at which it is captured and processed, as previously described.
In summary, embodiments of the invention provide television/film capture, production and/or transmission systems operating at a frame rate which is substantially higher than existing conventional television systems.
Video processing and editing, and in particular effects processing, may be performed at the substantially higher frame rate. The video format used by the system may use a bit depth of one bit. The high frame rate video may be down-converted to standard frame rates for transmission, or may be transmitted at the high frame rate to suitably modified or specially designed end-user equipment. Motion information determined for a high frame-rate signal can be used to improve compression of a corresponding lower frame rate signal or to enable reproduction by a receiver of a higher frame rate signal from a lower frame rate signal.
Embodiments of the invention can provide the following advantages:-
• Better representation of motion and sharper moving detail (possibly to the point where there is little or no difference between the rendition of static and moving detail) and hence a more realistic look due to the much higher frame rate compared to conventional television and cinema systems.
• Improved sense of three-dimensionality through sharper moving edges and more solid occlusion.
• A reduction of temporal aliasing due to the much higher frame rate compared to conventional television systems, reducing the "backwards-rotating wagon-wheel" and similar effects.
• Reduction of the perceptible flicker, judder and jerkiness associated with current TV and film frame/field rates and hence an expected reduction in related physiological and medical conditions such as photosensitive and pattern-sensitive epilepsy, eye-strain, headaches and nausea.
• Full backwards compatibility with conventional television cameras and displays can be achieved, albeit possibly by sacrificing to some extent the aforementioned better representation of motion.
• Simpler conversion to both 50fps and 60fps conventional television formats, making complicated standards-converters unnecessary.
• Increased tolerance of noise within individual frames, permitting an optional reduction in the number of bits used to represent each pixel without reducing the perceived image quality of the video.
• Lower delay in video processing systems such as standards-converters, graphics and effects units.
• Improved lip-sync.
• Increased artistic freedom for directors and camera operators, allowing the use of shots made impossible at present by the current frame/field rates of television and cinema, such as fast pans and scenes containing fast-moving objects moving across the image.
It will be understood that the present invention has been described above purely by way of example, and modification of detail can be made within the scope of the invention.

Claims

1. A method of providing a digital video signal, comprising encoding the signal with a colour depth of at most two bits per pixel colour component.
2. A method according to claim 1, comprising encoding the signal with a colour depth of one bit per colour component.
3. A method according to claim 1 or 2, wherein the signal has a frame rate substantially higher than conventional television or film frame rates.
4. A method according to any of the preceding claims, wherein the signal has a high frame rate selected such that, when displayed at the high frame rate, the effective colour depth as perceived by a viewer exceeds the actual encoded colour depth.
5. A method of providing a digital video signal, comprising encoding the signal with a first colour depth at a high frame rate, the high frame rate selected such that, when displayed at the high frame rate, the effective colour depth as perceived by a viewer exceeds the first colour depth with which the signal was encoded.
6. A method according to claim 4 or 5, wherein the effective colour depth as perceived by a viewer when the signal is displayed at the high frame rate corresponds at least to the colour depth of an image encoded with 4 bits per colour component, preferably 5 bits per colour component, more preferably 8 bits per colour component.
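The relationship between temporal integration and perceived depth underlying claims 4 to 6 is simple arithmetic: averaging N one-bit frames can yield N+1 distinct levels, i.e. roughly log2(N+1) bits of effective depth. A sketch, illustrative only and not part of the claims:

```python
import math

def effective_depth_bits(n_frames):
    """Effective colour depth, in bits, available from temporally
    integrating n_frames one-bit frames (n_frames + 1 levels)."""
    return math.log2(n_frames + 1)

# 255 integrated 1-bit frames give 256 levels, i.e. 8-bit depth
assert effective_depth_bits(255) == 8.0
# 15 frames suffice for the 4-bit effective depth mentioned in claim 6
assert effective_depth_bits(15) == 4.0
```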
7. A method according to any of the preceding claims, wherein the signal has a frame rate which is sufficiently high so that display of the signal at the high frame rate produces the effect of a full colour image on a viewer.
8. A method according to any of the preceding claims, wherein the signal has a frame rate of at least 5,000fps, preferably at least 10,000fps, more preferably at least 20,000fps.
9. A method according to any of the preceding claims, wherein the signal has a frame rate of at least 50,000fps, preferably at least 100,000fps.
10. A method according to any of the preceding claims, comprising encoding the video signal using a given conventional video format having an associated conventional frame rate, using a frame rate higher than the conventional frame rate, the high frame rate selected such that the data rate of the encoded signal at the high frame rate does not exceed the data rate of a conventional video signal encoded using the conventional video format at the conventional frame rate.
11. A method according to any of the preceding claims, comprising selecting the frame rate to enable transmission via a conventional video interface or channel, preferably an SDI (Serial Digital Interface) interface or channel.
12. A method according to claim 10 or 11, wherein the high frame rate is one of 8, 10, 256 or 1024 times the selected conventional television or film frame rate.
13. A method according to any of the preceding claims, wherein the signal is encoded using one bit for any primary colour component or luma component and at most two bits for any chroma component.
14. A method according to any of the preceding claims, wherein the signal is encoded using, for each pixel, a plurality of colour components corresponding to primary colours, each colour component encoded using one bit.
15. A method according to claim 14, wherein the signal is encoded in RGB colour space using one bit for each of the red, green and blue components.
16. A method according to any of the preceding claims, wherein the signal is encoded in a format having a luma component and a plurality of chroma components, the encoding using one bit for the luma component and at most two bits for each chroma component.
17. A method according to claim 16, wherein chroma components are each encoded with sign and magnitude bits.
18. A method according to claim 16, wherein the luma and chroma components are each encoded with one bit.
19. A method according to any of the preceding claims, comprising converting the signal to a second signal having a second colour depth greater than that of the signal.
20. A method according to claim 19, comprising performing spatial and/or temporal averaging of pixel values to obtain pixel values at the second colour depth.
21. A method according to claim 19 or 20, comprising compressing the converted video signal.
22. A method according to any of the preceding claims, comprising processing the video signal to perform image stabilisation based on correlation or motion estimation between successive frames.
23. A method of processing a video signal, comprising: receiving an input video signal defining a sequence of frames or fields encoded with a first colour depth; and generating a frame or field encoded with a second colour depth greater than the first colour depth from frames or fields of the sequence, the generating comprising: aggregating source pixel values for a plurality of (preferably adjacent) frames or fields in the sequence to produce output pixel values for the generated frame or field.
24. A method according to claim 23, wherein the output pixel values correspond to an average of the source pixel values quantized to the second colour depth.
25. A method according to claim 23 or 24, comprising generating a sequence of frames or fields with the second colour depth, each frame or field generated from a respective group of adjacent frames or fields of the input video signal.
26. A method according to claim 25, wherein the respective groups of adjacent frames or fields overlap.
27. A method according to claim 25 or 26, comprising, for each input frame or field of the input video signal: generating an output frame or field in the output signal based on a group of frames or fields in the input signal including said input frame or field.
28. A method according to any of claims 23 to 27, wherein the aggregating is performed on one or more groups of adjacent frames or fields, and wherein the number of frames or fields in the or each group is preferably selected in accordance with a variable parameter.
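The aggregation of claims 23 to 28 — pixelwise averaging of a group of adjacent 1-bit frames followed by quantisation to the greater colour depth (claim 24) — can be sketched as below. This is illustrative only; the function name and the 3-bit output depth are assumptions:

```python
import numpy as np

def aggregate_to_depth(one_bit_frames, out_bits):
    """Produce one frame at a greater colour depth from a group of
    1-bit frames: pixelwise average, then quantise to out_bits."""
    mean = np.mean(one_bit_frames, axis=0)     # values in [0, 1]
    levels = (1 << out_bits) - 1               # e.g. 7 for 3 bits
    return np.round(mean * levels).astype(np.uint16)

# eight 1-bit samples of one pixel -> one 3-bit value
group = np.array([1, 1, 0, 1, 0, 1, 1, 1]).reshape(8, 1, 1)
out = aggregate_to_depth(group, 3)
assert out[0, 0] == 5                          # 6/8 of full scale, rounded
```

Overlapping the groups, as in claim 26, simply re-uses frames between successive output images.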
29. A method according to any of claims 23 to 28, wherein the input video signal is a video signal as provided by a method of any of claims 1 to 18.
30. A camera adapted to capture a digital video signal with a bit depth of at most two bits per colour component.
31. A television receiver or display adapted to receive, decode, output and/or display a digital video signal having a bit depth of at most two bits per colour component.
32. A video editing system comprising means for editing video encoded with a bit depth of at most two bits per colour component, preferably at a frame rate substantially higher than conventional television or film frame rates, preferably at least 10,000fps, more preferably at least 100,000fps.
33. A video conversion system comprising means for converting between a first video format having a high frame rate and a bit depth of at most two bits per colour component and a second video format having a lower frame rate and a bit depth of more than two bits per colour component, preferably, wherein the second video format is a standard television or film format, preferably a PAL, NTSC or HDTV television format.
34. An image capturing device comprising: image sensor means for outputting a sequence of image frames with a first colour depth of one bit per colour component; and means for generating an output image with a colour depth greater than the first colour depth from a group of (preferably adjacent) frames of the sequence, the generating means adapted to aggregate source pixel values for the group of frames to generate output pixel values for the output image.
35. An image capturing device according to claim 34, comprising: means for receiving a parameter defining a virtual shutter speed; wherein the group of frames used to generate the output image is selected to extend over a given time period in accordance with the received parameter.
36. An image capturing device according to claim 35, wherein the image sensor means is adapted to output the sequence of image frames at a given frame rate, and wherein the generating means is adapted to select a number of adjacent frames for use in generating the output image in accordance with the received parameter.
37. An image capturing device according to any of claims 34 to 36, wherein the generating means is adapted to generate a video signal comprising a plurality of output image frames with a colour depth greater than the first colour depth.
38. An image capturing device according to any of claims 34 to 37, comprising means for analysing image frames of the sequence of image frames to track motion and/or perform image stabilisation.
39. Apparatus having means for performing a method as claimed in any of claims 1 to 29 or having means for processing, storing or transmitting a digital video signal as provided by the method of any of claims 1 to 29.
40. A television system having means for transmitting a digital television signal having a frame rate substantially higher than conventional television or film frame rates over a transmission medium to a plurality of television receiver devices.
41. A system according to claim 40, wherein the signal is encoded with a bit depth of at most two bits per colour component, preferably one bit per colour component.
42. A system according to claim 40 or 41, wherein the frame rate is substantially higher than one or more (preferably each) of: PAL frame or field rate, NTSC frame or field rate, and standard film frame rate.
43. A system according to any of claims 40 to 42, wherein the frame rate is at least 80fps, preferably at least 100fps, more preferably at least 150fps.
44. A system according to any of claims 40 to 43, wherein the frame rate is at least 300fps, preferably at least 600fps, more preferably at least 1200fps.
45. A system according to any of claims 40 to 44, wherein the frame rate is a multiple of one or more selected conventional television frame or field rates or film rates, preferably wherein the frame rate is a multiple of each of a plurality of selected conventional television frame or field rates or film rates.
46. A system according to claim 45, wherein the frame rate is a multiple of one or more, preferably each, of: 25fps PAL frame rate, 30fps approximate NTSC rate and 24fps standard film rate; more preferably wherein the frame rate is a multiple of one or more, preferably each of: 50 fields-per-second PAL field rate, 60 fields-per-second approximate NTSC field rate, and 24fps standard film rate.
47. A system according to any of claims 40 to 46, wherein the frame rate is 600fps or a multiple thereof.
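The suitability of 600fps in claims 45 to 47 follows from simple arithmetic: 600 is the least common multiple of the 50Hz PAL field rate, the (approximate) 60Hz NTSC field rate and the 24fps film rate, so conversion to each legacy format needs only whole-number frame grouping. The check below is illustrative only (`math.lcm` with multiple arguments requires Python 3.9 or later):

```python
from math import lcm

assert lcm(50, 60, 24) == 600
# whole-number down-conversion factors to each legacy rate
assert 600 // 50 == 12 and 600 // 60 == 10 and 600 // 24 == 25
```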
48. A system according to any of claims 40 to 47, wherein the signal defines a sequence of images displayable at the frame rate.
49. A system according to any of claims 40 to 48, comprising means for providing source video at the high frame rate, the providing means preferably including one or more cameras operable to capture source video at the high frame rate.
50. A system according to any of claims 40 to 49, comprising means for performing video editing or video processing at the high frame rate.
51. A system according to any of claims 40 to 50, comprising means for performing video effects processing on the high frame rate signal to add a video effect to the high frame rate signal, preferably a shuttering effect or a lighting effect.
52. A system according to any of claims 40 to 51, comprising means for transmitting the high frame rate signal to end user equipment adapted to output the signal at the high frame rate.
53. A system according to any of claims 40 to 52, comprising means for compressing the signal using three-dimensional block-based coding, preferably using temporal and/or spatial prediction of three-dimensional blocks of video data.
54. A system according to any of claims 40 to 53, comprising means for converting the television signal at the high frame rate to a low frame rate signal, preferably at a standard television frame rate or film rate.
55. A system according to claim 54, comprising means for transmitting the low frame rate signal to end user equipment.
56. A system according to claim 54 or 55, comprising means for deriving information for use in compression from the high frame rate signal, and means for compressing the low frame rate signal using the derived information.
57. A system according to claim 56, comprising means for deriving motion information from the high frame rate signal and means for compressing the low frame rate signal using the derived motion information.
58. A system according to any of claims 54 to 57, comprising means for deriving motion information from the high frame rate signal, and means for transmitting the motion information with the low frame rate signal to a receiver.
59. A system according to claim 58, wherein the receiver is adapted, using the received motion information, to output a signal at a frame rate higher than the low frame rate at which the signal was transmitted, preferably at the original high frame rate.
60. A method of providing a compressed video signal, comprising: receiving a video signal at a first frame rate; deriving information for use in compression from the video signal at the first rate; converting the video signal at the first frame rate to a video signal at a second frame rate different from the first frame rate; and compressing the video signal at the second frame rate using the derived information.
61. A method according to claim 60, wherein the second frame rate is lower than the first frame rate.
62. A method according to claim 60 or 61, wherein the information is motion vector information for use in interframe compression of the video signal.
63. A method according to any of claims 60 to 62, wherein the video signal is encoded using a bit depth of at most two bits, preferably one bit per colour component.
64. A method of transmitting a video signal, comprising: receiving a video signal at a high frame rate; deriving motion information from the high frame rate video signal; converting the high frame rate video signal to a low frame rate video signal; transmitting the low frame rate video signal and the motion information to a receiver; and at the receiver, deriving from the received low frame rate signal a signal at an output frame rate higher than the low frame rate at which the signal was transmitted using the received motion information, and outputting the derived signal at the output frame rate.
65. A method according to claim 64, wherein the output frame rate is equal to the high frame rate.
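The receiver-side up-conversion of claims 58, 59, 64 and 65 — synthesising intermediate frames from the low frame rate signal using transmitted motion information — might be sketched as below. This is illustrative only: it uses a single global motion vector per frame pair and `np.roll` as a stand-in for proper motion compensation, whereas a practical system would use per-block vectors:

```python
import numpy as np

def mc_interpolate(prev, nxt, mv, t=0.5):
    """Synthesise a frame at fraction t between prev and nxt by
    shifting each along the transmitted motion vector (dy, dx)
    and blending the two shifted frames."""
    fwd = np.roll(prev, (round(mv[0] * t), round(mv[1] * t)), axis=(0, 1))
    bwd = np.roll(nxt, (round(-mv[0] * (1 - t)), round(-mv[1] * (1 - t))),
                  axis=(0, 1))
    return 0.5 * (fwd + bwd)

# an object at (0,0) moving to (2,2) appears at (1,1) halfway through
a = np.zeros((4, 4)); a[0, 0] = 1.0
b = np.zeros((4, 4)); b[2, 2] = 1.0
mid = mc_interpolate(a, b, (2, 2))
assert mid[1, 1] == 1.0
```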
66. A method of processing video, comprising: receiving a video signal at a high frame rate, the high frame rate being substantially higher than conventional television or film frame rates; and performing video processing at the high frame rate to produce a processed video signal at the high frame rate.
67. A method according to claim 66, further comprising: converting the processed video signal to a low frame rate lower than the high frame rate, preferably a standard television or film frame rate; and outputting the low frame rate processed video signal.
68. A method according to claim 66 or 67, wherein the step of performing video processing comprises performing video editing.
69. A method according to any of claims 66 to 68, wherein the step of performing video processing comprises performing effects processing, preferably to apply a shuttering or lighting effect.
70. A method according to any of claims 66 to 69, wherein the step of performing video processing comprises applying a synthetic camera shuttering effect to the video signal.
71. A method of processing video, comprising: receiving a video signal at a high frame rate, the high frame rate preferably being substantially higher than conventional television or film frame rates; and processing the high frame rate signal to apply a synthetic camera shuttering effect to the video signal.
72. A method according to claim 70 or 71, wherein applying a camera shuttering effect comprises applying one or more temporal filters to the video signal.
73. A method according to claim 72, wherein the one or more temporal filters include one or more of: a non-rectangular temporal filter; an exponential temporal filter; a Gaussian temporal filter; a sinc function temporal filter; a windowed sinc function temporal filter; a non-linear temporal filter such as a median or rank order or morphological filter.
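One of the claim-73 filter shapes, the Gaussian temporal filter, can be sketched as follows. This is illustrative only: a real implementation would operate on full colour frames and derive sigma from the desired virtual shutter interval:

```python
import numpy as np

def gaussian_shutter(frames, sigma):
    """Emulate a soft synthetic shutter by convolving each pixel's
    time series (axis 0) with a normalised Gaussian kernel."""
    radius = int(3 * sigma)
    t = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (t / sigma) ** 2)
    kernel /= kernel.sum()                    # unity gain
    return np.apply_along_axis(
        lambda s: np.convolve(s, kernel, mode="same"), 0, frames)

# a static scene is unchanged away from the sequence boundaries
frames = np.ones((20, 2, 2))
out = gaussian_shutter(frames, sigma=1.0)
assert abs(out[10, 0, 0] - 1.0) < 1e-9
```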
74. A method according to any of claims 70 to 73, wherein the shuttering effect is applied differentially across the video image.
75. A method according to claim 74, wherein the shuttering effect is applied differentially depending on the type of motion.
76. A method according to claim 74 or 75, wherein shuttering is applied differentially for different colour components.
77. A method according to any of claims 66 to 76, wherein the step of performing video processing comprises detecting the effect on a scene caused by a variable light source based on the characteristic rate and/or phase of variation of the light source.
78. A method according to claim 77, wherein the step of performing video processing comprises identifying regions or objects of one or more video images where elements of the scene are illuminated by a given light source by detecting the characteristic rate and/or phase of flicker of the light source.
79. A method according to claim 77 or 78, comprising modifying one or more video images or identified objects or regions thereof to modify, reduce or remove illumination caused by the light source.
80. A method according to any of claims 77 to 79, comprising distinguishing between multiple light sources based on respective flicker rates or phases of the light sources.
81. A method according to any of claims 66 to 80, comprising processing the video image to remove a short flash from a particular light source.
82. A method according to claim 81, comprising detecting the one or more frames or parts of frames illuminated by the flash based on one or more of: higher than average luma; desaturation of colours; a degree of white clipping exhibited by the signal.
83. A method according to claim 81 or 82, comprising removing the flash at least partially by a process including one or more of: applying a median filter or other filter in time to the detected frames or parts of frames; reducing the luma level of the detected frames or parts of frames to that of surrounding frames or parts of frames; replacing one or more affected frames or parts of frames of the image with frames or parts of frames generated by interpolation from surrounding frames or parts of frames.
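The temporal median option in claim 83 lends itself to a compact sketch (illustrative only, not part of the claims): a flash lasting a single high frame rate frame is replaced by the pixelwise median over a short window of neighbouring frames.

```python
import numpy as np

def remove_flash(frames, k=3):
    """Suppress short flashes by replacing each frame with the
    pixelwise temporal median over a k-frame window."""
    pad = k // 2
    padded = np.pad(frames, ((pad, pad), (0, 0), (0, 0)), mode="edge")
    windows = np.stack([padded[i:i + len(frames)] for i in range(k)])
    return np.median(windows, axis=0)

# a one-frame flash (luma 10) in an otherwise level sequence
seq = np.ones((5, 1, 1)); seq[2] = 10.0
clean = remove_flash(seq)
assert clean[2, 0, 0] == 1.0    # flash frame replaced by its neighbours
assert clean[0, 0, 0] == 1.0    # surrounding frames untouched
```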
84. A method according to any of claims 66 to 83, wherein the step of performing video processing comprises applying a three dimensional convolution function to the video sequence, wherein the convolution function is shaped to filter along the trajectory of motion present in the video sequence.
85. A method of processing video to modify the contribution to a scene from a regularly varying light source, the method comprising: capturing the video at a frame rate which is greater than the rate of variation of the light source; detecting a contribution to one or more frames of video due to the light source based on the rate and/or phase of variation of the light source; and modifying one or more of the frames of video based on the detected contribution.
86. A method according to claim 85, comprising modifying the frame or frames to modify, increase, reduce or remove the detected contribution due to the light source.
87. A method according to claim 85 or 86, wherein the contribution is detected in part based on differences in colour temperature.
88. A method according to any of claims 85 to 87, wherein modifying one or more frames comprises correcting differences in white balance or colour shading within the one or more frames.
89. A method of compressing a video signal preferably having a frame rate substantially higher than conventional television or film frame rates, the video signal having two spatial dimensions and a temporal dimension, the method comprising: dividing the signal into a plurality of three-dimensional blocks of video data; and encoding the three-dimensional blocks.
90. A method according to claim 89, wherein encoding a block comprises calculating a frequency domain transform of the block; and quantising the resulting transform coefficients.
91. A method according to claim 89 or 90, comprising calculating prediction information for a block, and encoding the block using the prediction information.
92. A method according to claim 91, comprising performing spatial and/or temporal prediction.
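The three-dimensional transform coding of claims 89 to 92 can be illustrated with a separable 3-D DCT over a (time, y, x) block followed by uniform quantisation of the coefficients. This sketch uses NumPy only and is illustrative; a practical codec would add the prediction and entropy coding as claimed:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    j, k = np.arange(n)[:, None], np.arange(n)[None, :]
    M = np.cos(np.pi * (2 * k + 1) * j / (2 * n))
    M[0] *= np.sqrt(1.0 / n)
    M[1:] *= np.sqrt(2.0 / n)
    return M

def dct3(block):
    """Separable 3-D DCT: apply the 1-D transform along each axis."""
    for axis in range(3):
        M = dct_matrix(block.shape[axis])
        block = np.moveaxis(np.tensordot(M, block, axes=(1, axis)), 0, axis)
    return block

def quantise(coeffs, q=10.0):
    """Uniform quantisation of the transform coefficients."""
    return np.round(coeffs / q)

# the DC coefficient of a uniform 4x4x4 block carries all its energy
c = dct3(np.ones((4, 4, 4)))
assert abs(c[0, 0, 0] - 8.0) < 1e-9          # sqrt(64) for a unit block
assert np.all(np.abs(c.flatten()[1:]) < 1e-9)
```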
93. Video processing apparatus adapted to perform a method as claimed in any of claims 60 to 92.
94. A computer program or computer program product comprising software code adapted, when executed on a data processing apparatus, to perform a method as claimed in any of claims 1 to 29 and 60 to 92.
PCT/GB2009/050450 2008-04-30 2009-04-30 Television system WO2009133403A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB0807872.7 2008-04-30
GB0807872A GB2459684A (en) 2008-04-30 2008-04-30 Television Signal having high frame rate

Publications (2)

Publication Number Publication Date
WO2009133403A2 true WO2009133403A2 (en) 2009-11-05
WO2009133403A3 WO2009133403A3 (en) 2010-01-07

Family

ID=39522818

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2009/050450 WO2009133403A2 (en) 2008-04-30 2009-04-30 Television system

Country Status (2)

Country Link
GB (1) GB2459684A (en)
WO (1) WO2009133403A2 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11284133B2 (en) 2012-07-10 2022-03-22 Avago Technologies International Sales Pte. Limited Real-time video coding system of multiple temporally scaled video and of multiple profile and standards based on shared video coding information

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5625412A (en) * 1995-07-13 1997-04-29 Vision Research High-frame rate image acquisition and motion analysis system
US20020000994A1 (en) * 2000-04-14 2002-01-03 Neil Bergstrom System and method for superframe dithering in a liquid crystal display
EP1223549A1 (en) * 1999-10-04 2002-07-17 Hamamatsu Photonics K.K. Camera system for high-speed image processing
US6661463B1 (en) * 1983-05-09 2003-12-09 David Michael Geshwind Methods and devices for time-varying selection and arrangement of data points with particular application to the creation of NTSC-compatible HDTV signals
US20040051793A1 (en) * 2002-09-18 2004-03-18 Tecu Kirk S. Imaging device
US20040062305A1 (en) * 2002-10-01 2004-04-01 Dambrackas William A. Video compression system
US20060233438A1 (en) * 2005-04-14 2006-10-19 Samsung Electronics Co., Ltd. Methods and systems for video processing using super dithering
US7143432B1 (en) * 1999-10-01 2006-11-28 Vidiator Enterprises Inc. System for transforming streaming video data

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2175768B (en) * 1985-05-23 1989-04-05 Gec Avionics Television camera arrangement
US5528295A (en) * 1994-04-28 1996-06-18 Martin Marietta Corp. Color television camera using tunable optical filters
JP4337505B2 (en) * 2003-10-31 2009-09-30 ソニー株式会社 Imaging apparatus and imaging method, image processing apparatus and image processing method, image display system, recording medium, and program
EP1761058B1 (en) * 2005-08-31 2009-12-16 Luc Van Quickelberge Method and device for reproducing at a different rate that from the recording
CN101529890B (en) * 2006-10-24 2011-11-30 索尼株式会社 Imaging device and reproduction control device

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016038358A1 (en) * 2014-09-10 2016-03-17 Vidcheck Limited Method of adjusting video to minimise or remove pse triggers
US10911797B2 (en) 2014-09-10 2021-02-02 Telestream Uk Ltd Method of adjusting video to minimise or remove PSE triggers
EP3065391A1 (en) 2015-03-06 2016-09-07 The Trustees for the Time Being of Junior Barnes Family Trust An image capturing system and method for imaging cyclically moving objects
US10015412B2 (en) 2016-09-06 2018-07-03 The Trustees For The Time Being Of Junior Barnes Family Trust Video capturing system and method for imaging cyclically moving objects
WO2018198914A1 (en) * 2017-04-24 2018-11-01 Sony Corporation Transmission apparatus, transmission method, reception apparatus, and reception method
US11533522B2 (en) 2017-04-24 2022-12-20 Saturn Licensing Llc Transmission apparatus, transmission method, reception apparatus, and reception method
US11350115B2 (en) 2017-06-19 2022-05-31 Saturn Licensing Llc Transmitting apparatus, transmitting method, receiving apparatus, and receiving method
US11895309B2 (en) 2017-06-19 2024-02-06 Saturn Licensing Llc Transmitting apparatus, transmitting method, receiving apparatus, and receiving method

Also Published As

Publication number Publication date
GB2459684A (en) 2009-11-04
WO2009133403A3 (en) 2010-01-07
GB0807872D0 (en) 2008-06-04

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09738433

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09738433

Country of ref document: EP

Kind code of ref document: A2